NAVAL RESEARCH LOGISTICS QUARTERLY

MARCH 1978
VOL. 25, NO. 1

OFFICE OF NAVAL RESEARCH
NAVSO P-1278

EDITORIAL BOARD
Marvin Denicoff, Office of Naval Research, Chairman
Ex Officio Members
Murray A. Geisler, Logistics Management Institute
W. H. Marlow, The George Washington University
Bruce J. McDonald, Office of Naval Research, Tokyo
Thomas C. Varley, Office of Naval Research, Program Director
Seymour M. Selig, Office of Naval Research, Managing Editor

MANAGING EDITOR
Seymour M. Selig
Office of Naval Research
Arlington, Virginia 22217

ASSOCIATE EDITORS
Frank M. Bass, Purdue University
Jack Borsting, Naval Postgraduate School
Leon Cooper, Southern Methodist University
Eric Denardo, Yale University
Marco Fiorello, Logistics Management Institute
Saul I. Gass, University of Maryland
Neal D. Glassman, Office of Naval Research
Paul Gray, University of Southern California
Carl M. Harris, Syracuse University
Arnoldo Hax, Massachusetts Institute of Technology
Alan J. Hoffman, IBM Corporation
Uday S. Karmarkar, University of Chicago
Paul R. Kleindorfer, University of Pennsylvania
Darwin Klingman, University of Texas, Austin
Kenneth O. Kortanek, Carnegie-Mellon University
Charles Kriebel, Carnegie-Mellon University
Jack Laderman, Bronx, New York
Gerald J. Lieberman, Stanford University
Clifford Marshall, Polytechnic Institute of New York
John A. Muckstadt, Cornell University
William P. Pierskalla, Northwestern University
Thomas L. Saaty, University of Pennsylvania
Henry Solomon, The George Washington University
Wlodzimierz Szwarc, University of Wisconsin, Milwaukee
James G. Taylor, Naval Postgraduate School
Harvey M. Wagner, The University of North Carolina
John W. Wingate, Naval Surface Weapons Center, White Oak
Shelemyahu Zacks,
Case Western Reserve University

The Naval Research Logistics Quarterly is devoted to the dissemination of scientific information in logistics and will publish research and expository papers, including those in certain areas of mathematics, statistics, and economics, relevant to the over-all effort to improve the efficiency and effectiveness of logistics operations.

Information for Contributors is indicated on inside back cover.

The Naval Research Logistics Quarterly is published by the Office of Naval Research in the months of March, June, September, and December and can be purchased from the Superintendent of Documents, U.S. Government Printing Office, Washington, D.C. 20402. Subscription Price: $11.15 a year in the U.S. and Canada, $13.95 elsewhere. Cost of individual issues may be obtained from the Superintendent of Documents.

The views and opinions expressed in this Journal are those of the authors and not necessarily those of the Office of Naval Research.

Issuance of this periodical approved in accordance with Department of the Navy Publications and Printing Regulation P-35 (Revised 1-74).

ESTIMATING RELIABILITY GROWTH (OR DETERIORATION) USING TIME SERIES ANALYSIS*

Nozer D. Singpurwalla
Department of Operations Research
The George Washington University
Washington, D.C.

ABSTRACT

In this paper we propose a method for estimating reliability growth (or deterioration) using time series analysis. Our method does not call for the specification of a particular model, and estimates the growth in the presence of periodicity. We illustrate our procedure by considering some binomial failure data generated during the testing of a large system of the U.S. Navy.

1. INTRODUCTION

It is well known that the reliability of a newly developed system changes with time. This may be due to engineering, design, and/or other changes which the system is subjected to.
This time-dependent process is known as "reliability growth," and many a paper discussing its estimation (under various hypotheses) has appeared in the literature (see Donelson [9] for a recent comprehensive survey). Though the term reliability growth implies that reliability increases with time, there are instances wherein the reliability deteriorates with time, at least in the initial stages. This can possibly be attributed to unsatisfactory and/or poorly conceived changes to the system.

In this paper, we propose a method to estimate future reliability by considering the changes in reliability as a time series process. We emphasize that the scope of this paper is not restricted to merely estimating the increase in reliability. Once formulated as a time series process, a deterioration of reliability together with periodic patterns in it can also be estimated. In principle, our method is based on a theme proposed by Singpurwalla [16] for the time series analysis of the estimated failure rate function (see also Rice and Rosenblatt [13] for a related discussion). To the best of our knowledge, for an analysis of data on reliability growth, an approach of this type has not been considered before.

The recent models for reliability growth considered by Crow [8] and by Donelson [9] differ in spirit from ours. Instead of considering binomial trials, they treat situations where successive times between failures are available and hence use point process models/techniques.

*Research supported by the Office of Naval Research, Contract N00014-75-C-0729 and by the Aerospace Research Laboratories, Contract F33615-74-C-4040.

Our interest in considering the topic of reliability growth is based on a problem pertaining to the time behavior of the reliability of a large system of the U.S. Navy. Consequently, we are able to discuss our approach via a realistic situation, but by using camouflaged failure data.

2. PRELIMINARIES

In this paper, we shall consider the following situation. On a complex system which is chronologically undergoing developmental changes, tests are performed to monitor progress and to determine whether reliability requirements are being met. The outcome of each test is judged to be either a success or a failure. In particular, at the end of the $j$th stage, $N_j$ independent tests are conducted, of which $r_j$ are deemed to be successful. If we denote the reliability of the system at the end of the $j$th stage by $p_j$, then $r_j$ is binomially distributed with parameters $N_j$ and $p_j$. Let $\hat{p}_j$ be an estimator of $p_j$, $j = 1, 2, \ldots, M$.

Given a chronological history of tests performed on the system during the $M$ stages, our objective is to ascertain reliability growth or deterioration and to forecast the values of $p_j$, $j = M+1, M+2, \ldots$. Since the reliability of the system is dependent on time, it is reasonable to construe the sequence $\{\hat{p}_j\}_1^M$ as the realization of a time series process. Once the underlying process has been identified and its parameters estimated, forecasts of the future reliability together with probability limits on these forecasts can be readily obtained.

In order to implement this approach, we suggest using the Box-Jenkins [4] technique of time series analysis. In using this technique, we implicitly assume that the true values of reliability (either transformed or untransformed) at consecutive steps are related by an autoregressive integrated moving average (ARIMA) model with normally distributed residuals. We would like to point out that while the methods of this paper can be applied to a wide range of reliability growth problems, they may not always be the most appropriate ones to use. Alternate methods for analyzing such data emanate from the Fourier analysis of time series, and these are discussed in a recent book by Bloomfield [2].
We remark that the identification of a positive trend term would imply reliability growth, whereas the identification of a negative trend term would imply a deterioration of reliability. A principal advantage of such an approach is that it does not call for a specification of a particular model for reliability growth, as has been done in the past. A disadvantage is that it calls for reliability data to be taken at a large number of stages. Another useful feature of our technique is that, in conjunction with the time series interpretation, it allows the flexibility of incorporating deterministic inputs such as engineering judgments or managerial interventions into the analysis. This is elaborated upon in section 3.2.

3. CHOICE OF ESTIMATORS FOR THE RELIABILITY

Following the discussion given in section 2, we note that the available data consist of the sequence $\{N_j, r_j\}_1^M$. The reliability growth postulate requires that $p_i \geq p_j$ for $i \geq j$; $i, j = 1, 2, \ldots, M$. As stated before, this postulate is an idealization which in practice may not be true. Since the estimators $\hat{p}_i$ of $p_i$ ($i = 1, 2, \ldots, M$) may or may not satisfy the growth postulate, the traditional approach has been to average adjacent estimates whenever a reversal occurs. Thus, for example, if the estimates for the $k$th and the $(k+1)$st stages are reversed, that is, if $\hat{p}_{k+1} < \hat{p}_k$, then the estimators for $p_{k+1}$ and $p_k$ which satisfy the growth postulate are both taken as $(\hat{p}_{k+1} + \hat{p}_k)/2$. This procedure is repeated until monotonicity of the estimators is achieved.

For the purposes of this paper, we do not make this assumption. We shall take our sequence of estimators as they appear, bearing in mind that if it is desirable to satisfy the growth postulate, then the averaging procedure described above can still be used. In the latter case, the time series model will clearly differ from that produced by the case wherein the growth assumption was not made.
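The averaging of reversed adjacent estimates described above can be sketched as follows. This is a minimal illustration, not the procedure of any cited reference: the function name, the tolerance `tol`, and the pass limit are assumptions added so that the repeated averaging terminates in floating-point arithmetic, and the averaging is unweighted, as in the text.

```python
def average_reversals(estimates, tol=1e-12, max_passes=10_000):
    """Average adjacent estimates wherever a later stage falls below an earlier one.

    Sweeps are repeated until no reversal larger than `tol` remains.
    `tol` and `max_passes` are illustrative safeguards, not part of the text.
    """
    p = list(estimates)  # work on a copy; the input is left untouched
    for _ in range(max_passes):
        changed = False
        for k in range(len(p) - 1):
            if p[k + 1] < p[k] - tol:  # reversal between stage k and stage k+1
                mean = (p[k] + p[k + 1]) / 2.0
                p[k] = p[k + 1] = mean
                changed = True
        if not changed:
            return p
    raise RuntimeError("monotonicity not reached within max_passes")


# Hypothetical estimates with one reversal between the second and third stages.
adjusted = average_reversals([0.2, 0.5, 0.4, 0.7])
print(adjusted)
```

Because each averaging step can create a new reversal with the preceding stage, several sweeps may be needed before the sequence is nondecreasing.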
We now discuss some estimators for $\{p_j\}_1^M$ upon which a time series analysis would be performed.

3.1. Maximum likelihood estimator

A natural choice for $\hat{p}_j$ is the maximum likelihood estimator $r_j/N_j$, $j = 1, 2, \ldots, M$. For some applications, this estimator might not be suitable. This is particularly true when $N_j$ is small or when $r_j$ is either close to zero or close to $N_j$. In such cases, $\hat{p}_j$ is either 0 or 1 (or close to 0 or 1) and one is faced with the problem of fitting a time series to binary data.

The situation described above is not uncommon with data pertaining to reliability growth. This is because in the initial stages $r_j$ is apt to be close to zero, and in the latter stages $r_j$ is apt to be closer to $N_j$. This phenomenon is also true for the data considered in this paper. Barlow and Scheuer [1] get around this drawback by classifying the failures into two groups and by hypothesizing a trinomial model for testing at each stage.

As an alternative to the above, we can accumulate the $r_j$ and the $N_j$ and use a "smoothed" estimator of $p_j$, say $\bar{p}_j$, where

(1) $\bar{p}_j = \sum_{i=1}^{j} r_i \Big/ \sum_{i=1}^{j} N_i, \quad j = 1, 2, \ldots, M.$

An observation which is prompted by the estimator (1) is that the smoothing induces a high correlation between estimates for successive stages, especially as $j$ increases. In view of this, an interpretation of the series $\{\bar{p}_j\}_1^M$ has to be undertaken with some care. A strategy for reducing the effect of the induced correlation is to smooth over the preceding $k$ values, where $k$ is small, rather than smoothing over all the previous values as is done in (1). However, in our particular case this alternative was not implemented.

In the next section, we motivate the estimator (1) via a Bayesian argument.

3.2. The Bayes estimator

Often in practice, it is highly desirable to reflect engineering changes or managerial interventions in modified estimators at a particular stage of testing. One way to accomplish this is by using the following Bayesian scheme.
We start off by assigning a prior distribution to $p_1$, and then obtain the Bayes estimator of $p_1$ by using $r_1$ and $N_1$. The posterior distribution of $p_1$ is taken to be the prior distribution of $p_2$, and the Bayes estimator of $p_2$ is obtained by using $r_2$ and $N_2$. This procedure is repeated for $p_3$, $p_4$, and so on. We note that if for some $j$, say $j = k$, it is not desirable to use the posterior distribution of $p_{k-1}$ as the prior for $p_k$, then any desirable prior distribution can be assigned to $p_k$.

Since $0 < p_j < 1$, $j = 1, 2, \ldots, M$, a natural choice for the prior on $p_j$ is a member of the beta family of distributions. Thus, for instance, the prior density of $p_1$ could be of the form

$g(p_1) = \frac{1}{B(a, b)}\, p_1^{\,b-1} (1 - p_1)^{a-1}, \quad a, b > 0,$

where $B(u, v) = \int_0^1 x^{u-1} (1 - x)^{v-1}\, dx$.

It is easy to verify that the Bayes estimator of $p_1$, $p_1^*$, is

$p_1^* = (r_1 + b)/(N_1 + a + b).$

Upon continuing with the procedure discussed above, we can verify (see Singpurwalla and Lomas [17]) that the Bayes estimator of $p_j$, $p_j^*$, is given as

$p_j^* = \Big(\sum_{i=1}^{j} r_i + b\Big) \Big/ \Big(\sum_{i=1}^{j} N_i + a + b\Big).$

In the absence of any knowledge about $p_1$, it is reasonable to take $g(p_1)$ as the uniform distribution, so that $a = b = 1$. In this case $p_j^*$ becomes

(2) $p_j^* = \Big(\sum_{i=1}^{j} r_i + 1\Big) \Big/ \Big(\sum_{i=1}^{j} N_i + 2\Big),$

and this is akin to the estimator (1) given in section 3.1.

Often, in practice (see Box and Jenkins [4]), it is required that we work with transformations of the estimators given by (1) and (2). These transformations are alluded to in section 4.1.

4. TIME SERIES ANALYSIS OF BINOMIAL DATA

As stated before, our approach construes the sequence $\{\hat{p}_j\}_1^M$ (or its transformed values) as the realization of a stochastic process. It then uses this sequence to estimate a time series model of the type suggested by Box and Jenkins [4]. Box and Jenkins propose a class of models wherein the values $\{\hat{p}_j\}_1^M$ (or their transformed values) are related by an ARIMA process.
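As a small numerical illustration of the estimators of section 3, the sketch below computes the smoothed estimator (1) and the Bayes estimator of section 3.2 from cumulative successes and trials. The function names and the three-stage data are hypothetical; `a` and `b` are the beta prior parameters of the text, and the defaults `a = b = 1` reproduce (2).

```python
from itertools import accumulate


def smoothed_estimates(successes, trials):
    """Estimator (1): cumulative successes divided by cumulative trials."""
    cum_r = accumulate(successes)
    cum_n = accumulate(trials)
    return [r / n for r, n in zip(cum_r, cum_n)]


def bayes_estimates(successes, trials, a=1.0, b=1.0):
    """Bayes estimator (sum r_i + b) / (sum N_i + a + b); a = b = 1 gives (2)."""
    cum_r = accumulate(successes)
    cum_n = accumulate(trials)
    return [(r + b) / (n + a + b) for r, n in zip(cum_r, cum_n)]


# Hypothetical data: successes r_j out of N_j tests at three stages.
r = [0, 1, 3]
N = [2, 2, 4]
print(smoothed_estimates(r, N))
print(bayes_estimates(r, N))
```

With a uniform prior the Bayes estimate never equals exactly 0 or 1, which avoids the binary-data difficulty noted for the maximum likelihood estimator when early stages show no successes.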
For purposes of model building and forecasting, it is particularly desirable that the values be related by an ARIMA model with normally distributed residuals. In practice, of course, one does not know if the relationship, as described above, is appropriate for the particular problem at hand. The prescription given by Box and Jenkins (Chapter 8) is to go ahead and fit a parsimonious ARIMA model, and then to examine the residuals from this model to ascertain if they are uncorrelated and normally distributed. Of the several ways in which this can be done, the portmanteau lack of fit test of Box and Pierce [5] and the cumulative periodogram test are the most common. In addition to this, one can test for the normality of the residuals by goodness of fit tests such as that proposed by Shapiro, Wilk, and Chen [15], or by Lilliefors [12], or by merely plotting them on normal probability paper.

The ARIMA model can be generalized to include a trend term and/or seasonal terms. This generalized model is denoted by $(p, d, q) \times (P, D, Q)_s + \theta_0$, where $p$ ($P$) and $q$ ($Q$) denote the number of regular (seasonal) autoregressive and moving average terms, respectively, $s$ denotes the length of the period for the seasonal terms, and $d$ ($D$) denotes the order of regular (seasonal) differences in the model. The parameter $\theta_0$, when nonzero, signifies the presence of a trend term in the model.

The Box and Jenkins method essentially involves three phases, and all three phases will have to be undertaken to estimate reliability growth. The model identification phase enables us to obtain initial guesses of $p$ ($P$), $d$ ($D$), $q$ ($Q$), and $s$ by an examination of the autocorrelation function (ACF) and the partial autocorrelation function (PACF) of the sequence under question. The parameter estimation phase enables us to obtain estimates of the parameters of the tentatively identified model.
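The sample autocorrelations used in the identification phase, and the portmanteau statistic of Box and Pierce [5] used to check residuals, can be sketched in a few lines. This is an illustrative implementation under stated assumptions: the function names are hypothetical, the statistic is taken in its usual form $Q = n \sum_{k=1}^{K} r_k^2$ with $n$ the number of residuals, and the comparison with a chi-squared table (with $K$ minus the number of estimated ARMA parameters as degrees of freedom) is left to the reader.

```python
def sample_acf(x, max_lag):
    """Sample autocorrelations r_1, ..., r_max_lag of the sequence x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    if c0 == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        for k in range(1, max_lag + 1)
    ]


def box_pierce_q(residuals, max_lag):
    """Portmanteau statistic Q = n * (sum of squared residual autocorrelations)."""
    r = sample_acf(residuals, max_lag)
    return len(residuals) * sum(rk * rk for rk in r)


# Hypothetical alternating residuals: strongly negative lag-1 autocorrelation.
resid = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
print(sample_acf(resid, 2))
print(box_pierce_q(resid, 1))
```

A small value of `box_pierce_q` relative to the tabulated chi-squared percentage point gives no evidence of residual autocorrelation; a large value suggests the tentative model is inadequate.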
The estimation phase is greatly facilitated by the availability of a computer program, and this is an attractive feature of using the method of Box and Jenkins. The diagnostic checking phase follows the parameter estimation phase, and enables us to judge the goodness of the tentatively identified and estimated model. For the purpose of testing for reliability growth (decay), one has simply to see if $d \geq 1$. Once an adequate model is chosen, forecasts of future reliability $p_j$, $j = M+1, M+2, \ldots$, and probability limits for these forecasts, can be obtained.

To summarize, we claim that an attractive feature of testing for reliability growth using time series analysis is that it is not necessary to postulate (a priori) a model for reliability growth, and that reliability growth (or decay) can be checked in the presence of periodic components. In the past, the presence of periodic components in the analysis of failure data has not been accounted for by reliability analysts, because their existence in reliability problems has been neither recognized nor understood. However, retrospective failure data, such as those presented in section 5 of this paper or those presented by Singpurwalla [16], reveal that periodic components can be present. Of course, if periodic components are discovered in the data, one should always make an effort to determine the physical source which is responsible for causing them. In addition to this, the time series-based methods have a built-in theory of forecasting, and this is useful when one works with problems of estimating future reliability.

4.1. Transformations of the estimated reliability

Because the elements of the sequence $\{\hat{p}_j\}_1^M$ constitute binomial data, it might be desirable to take transformations of the $\hat{p}_j$. In taking such transformations, one hopes to make the data normally distributed with constant variance (see Granger and Newbold [11]). An example of such a transformation is the empirical logistic transform discussed by Cox [7].
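Two transformations commonly applied to binomial proportions can be sketched as follows. The empirical logistic transform is written here in its usual form $\log\{(r + \tfrac{1}{2})/(N - r + \tfrac{1}{2})\}$, and the arcsine square root transform is a standard variance-stabilizing choice for proportions; the function names are hypothetical, and which transform (if any) suits a given series is a matter for the residual analysis.

```python
import math


def empirical_logit(r, n):
    """Empirical logistic transform log((r + 1/2) / (n - r + 1/2)).

    The added 1/2 keeps the value finite when r = 0 or r = n.
    """
    if not 0 <= r <= n:
        raise ValueError("need 0 <= r <= n")
    return math.log((r + 0.5) / (n - r + 0.5))


def arcsine_sqrt(p):
    """Arcsine square root transform of a proportion p in [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return math.asin(math.sqrt(p))


# Hypothetical stage: 0 successes in 4 tests still yields a finite value.
print(empirical_logit(0, 4))
print(arcsine_sqrt(0.25))
```

Forecasts made on a transformed scale must be mapped back to the reliability scale before they are interpreted.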
A general procedure for determining a suitable transformation of the data is given by Box and Cox [3]. The use of such transformations in time series analysis is mentioned, for example, in Box and Jenkins [4], by Zarembka [18], and by Wilson in the discussion in Chatfield and Prothero [6]. Box and Cox use the method of maximum likelihood for obtaining the correct parameter for a class of power transforms. The logarithm, the square root, and the inverse transform are all special cases of the power transform. Considerations leading to the choice of the arcsine square root transform ($\sin^{-1}\sqrt{\cdot}$) are discussed by Scheffé [14].

For an analysis of the kind of data suggested by this problem, we recommend that any one of the transforms suggested above be used. The manner in which one goes about choosing a transform is by performing a time series analysis on the transformed as well as the untransformed data, and then choosing that transformation which gives the best results in terms of the analysis of residuals as suggested in Chapter 8 of Box and Jenkins [4].

For the situation discussed in this paper, there was a preference for using a Bayesian scheme; thus, the estimator chosen by us was of the form given by (2).

5. AN ILLUSTRATIVE EXAMPLE

In this section we attempt to illustrate our procedure by considering a realistic example. The data for this example have arisen out of tests performed on a large system of the U.S. Navy. The data have been camouflaged in a manner which does not destroy their basic character.

We would like to emphasize that the time series analysis of the data presented here represents our best effort in terms of identifying and fitting a Box-Jenkins type model. One can conceivably perform a more subtle analysis of the data and come up with a different and perhaps more appropriate Box-Jenkins type model than the one which we have obtained.
Alternatively, one can perform a Fourier analysis of this data and obtain a further insight into its nature. Because of our lack of familiarity with this technique, we have, for now, chosen not to undertake this endeavor. Our objective in presenting this example is purely an illustrative one. Our main thrust here is the idea of looking at the problem of reliability growth as one of time series analysis.

We choose $\hat{p}_j$, $j = 1, 2, \ldots, 75$, to be the estimator given by (2), and in Figure 1 we present a plot of the sequence $\{\hat{p}_j\}$. A visual examination of the plot reveals an initial increase in reliability, followed by a sharp drop, and then another increase followed by a gradual decay. In particular, the latter two-thirds of the series reveals a downward trend with a sawtooth-like periodic nature. Because of the appearance of this plot, one is inclined to simplify matters by deleting the first part of the series from one's analysis. However, in the interest of retaining the true character of this data, we choose not to do this, and attempt to fit the best Box-Jenkins type model based on the entire series.

In Figure 2, we present a plot of the estimated autocorrelation function of the data. This plot reveals that the series is nonstationary, and so in Figures 3 and 4 we present the estimated autocorrelation and the partial autocorrelation of the first difference of the series. The spikes at lags 5, 10, and 15 indicate the presence of a periodic term of order 5 in the series. Since the series initially takes a large leap and then a sharp drop after about five lags, one can suspect that the large values of the estimated autocorrelation function at lag 5 may be due to this phenomenon. In view of this, the autocorrelation function and the partial autocorrelation function were recomputed excluding the first six observations of the series.
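The regular and seasonal differencing used in this identification step can be illustrated with a short sketch. The function name and the artificial series below are assumptions chosen for illustration only: a linear trend plus a period-5 sawtooth, loosely mimicking the pattern described for the data, is reduced to zeros by one seasonal difference of period 5 followed by one regular difference.

```python
def difference(series, lag=1):
    """Return the lagged differences x_t - x_{t-lag} for t = lag, ..., n-1."""
    if lag < 1 or lag >= len(series):
        raise ValueError("lag must satisfy 1 <= lag < len(series)")
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Artificial series: linear trend plus a sawtooth of period 5.
x = [2 * t + (t % 5) for t in range(20)]

seasonal = difference(x, lag=5)       # removes the period-5 pattern
both = difference(seasonal, lag=1)    # then removes the remaining constant level
print(seasonal)
print(both)
```

Each differencing operation shortens the series by its lag, so applying both operators to 75 observations leaves 69 values for estimation.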
Such a calculation reconfirmed the periodic behavior as observed with all the data, and this plus a visual inspection of the series allows us to conclude that the change in reliability occurs with a periodic component. A retrospective investigation into the source of this periodicity revealed an explanation for it, and this pertained to the fact that the items were almost invariably tested in groups of five, and that there was a certain pattern in the manner in which the successive tests were conducted.

[Figure 1. Plot of the sequence $\{\hat{p}_j\}$ vs. $j$ (horizontal axis: trial number).]

[Figure 2. Estimated autocorrelation function.]

[Figure 3. Estimated autocorrelation of 1st difference (note periodic component at lag 5).]

[Figure 4. Estimated partial autocorrelation of 1st difference (note periodic component at lag 5).]

[Figure 4-a. Estimated autocorrelations of the residuals from model.]

Based upon an examination of Figures 3 and 4, and the fifth seasonal difference of the data, several Box-Jenkins type models were fitted to the data. Of all the models that were attempted, the multiplicative seasonal model $(0, 1, 0) \times (0, 1, 1)_5 + \theta_0$ appeared to be the best candidate. An examination of the autocorrelation function of the residuals from this model (Figure 4-a) reveals a large value at lag 3, suggesting a (0, 0, 3) model for the residuals.
Based upon these considerations, a final model which adequately describes the given data is of the form

$(1 - B)(1 - B^5)\,\hat{p}_j = \theta_0 + (1 - \theta_1 B^3)(1 - \theta_2 B^5)\,a_j.$

We remark that in the above model, the moving average parameters associated with $B$ and $B^2$ are not significant, and that the only significant parameters are $\theta_1$, a regular moving average parameter, and $\theta_2$, a seasonal moving average parameter. Thus the terms $B$ and $B^2$ do not appear in the model equation. The $a_j$ are the residuals from this model, which constitute a white noise process.

The trend term $\theta_0$ was estimated to be $-0.004$, and as is apparent, we conclude that there is a deterioration in reliability. The values of $\theta_1$ and $\theta_2$ were estimated as 0.23 and 0.63, respectively. When the above values are incorporated into the model, and the model equation is spelled out, we have

$\hat{p}_j = -0.004 + \hat{p}_{j-1} + \hat{p}_{j-5} - \hat{p}_{j-6} + a_j - 0.23\,a_{j-3} - 0.63\,a_{j-5} + 0.15\,a_{j-8}.$

In Figure 5, the bold curve shows forecasts of future reliability using the above model. We remark that the incorporation of a negative trend term maintains the pattern of degradation for lags extending into the indefinite future. From the point of view of forecasting, this might be objectionable. This is because unless there is a physical reason for reliability to degrade systematically, one would assume that sooner or later the trend would be reversed.

[Figure 5. Forecast of future values (horizontal axis: future trials; curves: forecast using trend term, forecast without trend term).]

In view of the above, it is desirable to obtain forecasts by arbitrarily setting $\theta_0$ equal to zero. In Figure 5 the dotted curve shows forecasts of future reliability after such a change has been made. Since $\theta_0 = -0.004$, the similarity between the two curves in Figure 5 is not too surprising.

In order to verify the adequacy of our model, and in order to ascertain that the assumptions underlying the technique are satisfied, we performed an analysis of the residuals from the final model.
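The spelled-out model equation can be turned into a forecasting recursion by replacing each future residual with its expected value of zero, which is the usual way point forecasts are generated from a difference equation of this type. The sketch below is illustrative only: the function name and the flat test history are hypothetical, and the coefficient on $a_{j-8}$ is computed as the product $\theta_1 \theta_2$ (0.1449 for the rounded estimates quoted above, against the printed 0.15, which presumably reflects unrounded estimates).

```python
def forecast_reliability(history, residuals, steps,
                         theta0=-0.004, theta1=0.23, theta2=0.63):
    """Point forecasts from (1-B)(1-B^5) p_j = theta0 + (1-theta1 B^3)(1-theta2 B^5) a_j.

    `history` holds the observed estimates and `residuals` the fitted residuals
    for the same stages; future residuals are set to zero.
    """
    if len(history) != len(residuals) or len(history) < 8:
        raise ValueError("need at least 8 aligned observations and residuals")
    p = list(history)
    a = list(residuals)
    forecasts = []
    for _ in range(steps):
        j = len(p)      # index of the stage being forecast
        a.append(0.0)   # future shock replaced by its expected value
        value = (theta0 + p[j - 1] + p[j - 5] - p[j - 6]
                 + a[j] - theta1 * a[j - 3] - theta2 * a[j - 5]
                 + theta1 * theta2 * a[j - 8])
        p.append(value)
        forecasts.append(value)
    return forecasts


# Hypothetical flat history with zero residuals, to isolate the trend term.
flat = [0.8] * 8
zeros = [0.0] * 8
print(forecast_reliability(flat, zeros, steps=6))
print(forecast_reliability(flat, zeros, steps=6, theta0=0.0))
```

With a flat history the forecasts with the trend term drift downward at an accelerating rate, while those with `theta0=0.0` stay level, which mirrors the remark in the text about degradation persisting into the indefinite future.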
In Figures 6 and 7 we present plots of the autocorrelations and the partial autocorrelations of the residuals, and note that these are not significant at the 5% level. A portmanteau lack of fit test showed that the sum of squares of the autocorrelations of the residuals is 7.3 with 21 degrees of freedom. Thus, per the chi-squared test, this value is not significant. In Figure 8 we present a plot of the periodogram of the residuals, and note that the periodicities in the data have been reasonably well accounted for.

In order to see if the residuals are (approximately) normally distributed, we first plot them on normal probability paper. In Figure 9 we display this plot, and note that except for the two outliers at the end of the plot, the residuals are fairly normal. The two outliers are caused by the two large jumps in the initial portion of the series, which, as discussed before, could have been eliminated from consideration by us. In addition to this, we performed the test for normality proposed by Lilliefors [12], and found that the residuals pass the test at the 1% level of significance. The Shapiro, Wilk, and Chen [15] test is tabulated for use with 50 or fewer observations, and thus it could not be used here.

[Figure 6. Estimated autocorrelations of the residuals of the final model.]

[Figure 7. Estimated partial autocorrelations of the residuals of the final model.]

[Figure 8. Cumulative periodogram of residuals (horizontal axis: frequency).]

5.1. Conclusions

Because of the above tests, we claim that our chosen model is adequate for describing the given data. The analysis of residuals discussed above leads us to conclusions which are satisfactory if we are willing to make some minor concessions; perhaps this is a necessary attitude that one has to take in any endeavor involving model building. Our chosen model captures the essential features of the latter two-thirds of the series, a phase during which the process of reliability deterioration achieves a stable pattern.

Our chosen model may cause some concern because it leads us to two outliers during the initial portion of the series. This happens because the reliability experiences significant changes in the initial stages of development, and this is compounded by the fact that there is a comparative lack of smoothing during these stages. In principle, it is possible to account for such drastic behavior in the data by the use of dummy variables. However, this is meaningful only if one has an intimate knowledge of the system, and can attribute such behavior to a specific cause. Thus, the two outliers will have to be judged in the light of the above comments.

As a final word, our main objective here is to advocate the use of time series modeling techniques for the analysis of reliability growth problems. We have used the Box-Jenkins technique as a way of illustrating our approach. Refinements to what we have done are possible either by way of performing a better Box-Jenkins analysis of this data, or by analyzing this data by alternate time series-based techniques.

ACKNOWLEDGMENTS

The author is grateful to the referees for their keen interest and their excellent comments. He also acknowledges the assistance of Mr. Mahesh Chandra with the analysis of the data.

[Figure 9. Cumulative frequency of residuals (normal probability plot; horizontal axis from -0.04 to 0.03).]
BIBLIOGRAPHY

[1] Barlow, R. E. and E. M. Scheuer, "Reliability Growth During a Development Testing Program," Technometrics 8 (1) 53-60 (1966).
[2] Bloomfield, P., Fourier Analysis of Time Series: An Introduction (John Wiley and Sons, Inc., New York 1976).
[3] Box, G. E. P. and D. R. Cox, "An Analysis of Transformations," Journal of the Royal Statistical Society Series B 26 211 (1964).
[4] Box, G. E. P. and G. M. Jenkins, Time Series Analysis, Forecasting and Control (Holden-Day, Inc., San Francisco 1970).
[5] Box, G. E. P. and D. A. Pierce, "Distribution of Residual Autocorrelations in Autoregressive Integrated Moving Average Time Series Models," Journal of the American Statistical Association 64 (1970).
[6] Chatfield, C. and D. L. Prothero, "Box-Jenkins Seasonal Forecasting: Problems in a Case Study," Journal of the Royal Statistical Society Series A 136 (1973).
[7] Cox, D. R., The Analysis of Binary Data (Methuen and Company, Ltd., London 1970), Methuen's Monographs on Applied Probability and Statistics.
[8] Crow, L. H., "Reliability Analysis for Complex, Repairable Systems," in Reliability and Biometry, F. Proschan and R. Serfling, Eds. (SIAM, Philadelphia, Pennsylvania 1974).
[9] Donelson, J., III, "Duane's Reliability Growth Model as a Nonhomogeneous Poisson Process," Institute for Defense Analyses Paper, Arlington, Virginia (1975) (to appear).
[10] Goodman, L. A., "Interactions in Multidimensional Contingency Tables," Annals of Mathematical Statistics 35 636-646 (1964).
[11] Granger, C. W. J. and P. Newbold, "Forecasting Transformed Series," Journal of the Royal Statistical Society Series B 38 (2) 189-203 (1976).
[12] Lilliefors, H. W., "The Kolmogorov Test for Normality with Mean and Variances Unknown," Journal of the American Statistical Association 62 399-402 (1967).
[13] Rice, J. and M. Rosenblatt, "Estimation of the Log Survivor Function and the Hazard Function," Technical report, University of California, San Diego, California (1975).
[14] Scheffé, H., The Analysis of Variance (John Wiley and Sons, Inc., New York 1959).
[15] Shapiro, S. S., M. B. Wilk, and H. J. Chen, "Comparative Study of Various Tests for Normality," Journal of the American Statistical Association 63 1343-1372 (1968).
[16] Singpurwalla, N. D., "Time Series Analysis and Forecasting of Failure Rate Processes," in Reliability and Fault Tree Analysis, R. E. Barlow, J. B. Fussell, and N. D. Singpurwalla, Eds. (Society for Industrial and Applied Mathematics, Philadelphia, Pennsylvania 1975).
[17] Singpurwalla, N. D. and C. M. Lomas, "The Time Series Analysis of Binomial Data with Applications to Missile Reliability Assessment," Technical Memorandum Serial TM-64701, Program in Logistics, The George Washington University, Washington, D.C. (1974).
[18] Zarembka, P., "Functional Form in the Demand for Money," Journal of the American Statistical Association 63 502-511 (1968).

GENERAL TRIGGER-OFF REPLACEMENT PROCEDURES FOR TWO-UNIT SYSTEMS

Menachem Berg
University of Haifa
Haifa, Israel

ABSTRACT

A trigger-off replacement policy, suggested and analyzed in [1], for two-unit systems composed of identical units, is generalized and extended in this work in several ways. In the first part of the paper we obtain the appropriate integral equations for nonidentical units and then use them for a complete solution of the case of two units whose lifetimes are distributed according to general Erlang distributions.

In the second part of the paper we extend the trigger-off policy itself by allowing preventive replacements of units which reach a certain critical age. The system stops working if either one of the two units fails or reaches its critical age. Both cases present natural replacement possibilities for the remaining unit, provided that its age exceeds a predetermined critical age.

Finally, we consider the question of which policy parameters, i.e.
control limits and critical ages, to choose when facing a real-life situation. Using the criterion of expected costs per unit time in the long run, we show how to find the optimal parameters which minimize this objective function. The fact that a restricted optimization, within the class of trigger-off replacement policies, leads to the global optimal policy has been proved in [2] by a different approach.

1. INTRODUCTION AND SUMMARY

A machine consists of two stochastically failing units. Failure of either of the units causes a breakdown of the machine, and the failed unit has to be replaced immediately. Bansard et al. [1] suggested for this machine a trigger-off replacement procedure under which, at any failure epoch of either of the two units, we replace the unfailed unit too, if its age exceeds a predetermined control limit L. We call this policy an Opportunistic Failure Replacement Policy (OFRP).

The approach of Bansard et al. was purely descriptive. They computed some long-run operating characteristics of the OFRP, namely, the expected number per unit time in the long run of single and common replacements of the units, in terms of the stationary joint p.d.f. of the ages of the units. This joint p.d.f., however, was expressed only in an implicit form, by means of an integral equation depending on the p.d.f.'s of the lifetimes of the units. Obtaining an exact solution of the problem depends therefore on our ability to solve this integral equation. This was done in [1] for exponential and uniform lifetime distributions.

We shall generalize the above results in two ways. First we shall carry out the analysis for nonidentical units, as distinct from the identical units case discussed in [1], and obtain the appropriate integral equations for the stationary joint p.d.f. of the ages of the units. These equations will then be solved for units whose lifetimes are distributed according to general nonidentical Erlang distributions.
Some special cases will then be discussed in detail.

In another part of this paper we attempt to extend the policy itself. As we shall see later, our limited intervention under the OFRP, at failure epochs only, is not always adequate, and a more extensive replacement procedure is not only desirable but necessary. One such procedure is called an Opportunistic Age Replacement Policy (OARP). Under the OARP, apart from continuing with opportunistic replacements at failure epochs as prescribed by the OFRP, we also replace every unit when it reaches a predetermined critical age S. At planned replacement epochs as well as at failure epochs, we allow the replacement of the second unit if its age exceeds its control limit. The OARP is a generalization of both the OFRP and independent ARP's for the two units, since these two policies can be obtained from the OARP by setting $S = \infty$ and $L = \infty$, respectively.

The OARP may appear to be similar to the policy suggested in [3], but there is a critical difference. In [3] only one unit is subject to preventive replacement, while the other unit merely provides replacement opportunities for this unit.

In the last part of this work we come to the inevitable problem when facing a real-life situation, namely, which policy parameters ($L_i$ for the OFRP; $L_i$ and $S_i$ for the OARP) to choose. Although the above descriptive approach does not deal directly with this question, its results can be readily used to tackle it. Assume that a cost structure is imposed on the problem, including replacement and running costs of the units, with the aim of minimizing the expected total cost per unit time in the long run. We want to find the optimal values of the policy parameters for which this objective function attains its minimum. In order to accomplish this, we first compute our objective function in terms of the parameters of the policy under discussion (OFRP or OARP).
This computation is easy since the expected cost per unit time in the long run can be expressed in terms of some basic long-run operating characteristics of the policy. These characteristics are readily computable by our method. Obtaining the optimal values of the policy parameters which minimize the objective function becomes an exercise in calculus. This optimization procedure will be carried out in detail for some special cases.

It should, however, be noted that this is in fact a restricted optimization; i.e., we have found the best (say) OFRP among all possible OFRP's. A pertinent question that might be asked is how good the best OFRP is in comparison with other possible replacement procedures which limit planned interventions to failure epochs only. Employing a different technique, [2] answers this question. It is proved there that the best OFRP is globally optimal amongst all possible stationary and nonstationary replacement procedures of this type. Although this was done there only for a special case, the result remains valid for the more general cases considered in this work.

2. AN OPPORTUNISTIC FAILURE REPLACEMENT POLICY (OFRP)

2.1. A General Solution for Nonidentical Units

A machine consists of two stochastically failing units whose lifetimes are distributed according to c.d.f.'s $F_i(\cdot)$ ($i = 1, 2$). The lifetime p.d.f.'s are denoted by $f_i(\cdot)$ and the hazard functions are $h_i(\cdot) = f_i(\cdot)/\bar{F}_i(\cdot)$, where $\bar{F}_i(\cdot) = 1 - F_i(\cdot)$. Failure of either of the two units causes a breakdown of the machine, and the failed unit has to be replaced immediately. Any such breakdown of the machine provides a natural replacement opportunity for the unfailed unit, without the necessity of stopping an operating machine, which is an undesirable action in many real-life situations.
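The hazard function $h = f/\bar{F}$ defined above can be made concrete with a small numerical sketch. The Erlang family is used here only as an assumed example (the paper treats it later); the function names `erlang_pdf`, `erlang_survival`, and `hazard` are illustrative and not part of the original text.

```python
import math


def erlang_pdf(x, m, lam):
    """Density of an m-stage Erlang lifetime with rate lam, for x >= 0."""
    return lam * math.exp(-lam * x) * (lam * x) ** (m - 1) / math.factorial(m - 1)


def erlang_survival(x, m, lam):
    """Survival function 1 - F(x) of the same Erlang lifetime."""
    partial_sum = sum((lam * x) ** j / math.factorial(j) for j in range(m))
    return math.exp(-lam * x) * partial_sum


def hazard(x, m, lam):
    """Hazard h(x) = f(x) / (1 - F(x)); an illustrative helper, not the paper's code."""
    return erlang_pdf(x, m, lam) / erlang_survival(x, m, lam)
```

For a one-stage (exponential) lifetime the hazard is constant and equal to the rate, while for more stages it increases with age, which is the situation in which replacing an aged unfailed unit can pay off.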
The replacement procedure which we choose to follow is the following: at any failure epoch of either of the two units, we replace the unfailed unit too, if its age exceeds a control limit $L_i$ ($i = 1, 2$). We never intervene when the machine is operative. We call this policy an Opportunistic Failure Replacement Policy (OFRP).

Let us now define $X_i(t)$ as the age of unit $i$ at time $t$ and set $X_i(0) = 0$ ($i = 1, 2$). The stochastic process $X_i(t)$ increases linearly with slope 1 until a replacement of unit $i$ occurs, at which point it jumps back to zero and starts afresh. It is easy to verify that the two dependent processes $X_1(t)$ and $X_2(t)$ always satisfy $-L_2 < X_1(t) - X_2(t) < L_1$. The feasible region of $(X_1(t), X_2(t))$ under the OFRP is therefore

(1) $D_t = \{(x_1, x_2) \mid 0 \le x_1 \le t,\ \max(0, x_1 - L_1) \le x_2 < \min(t, x_1 + L_2)\}$.

By the Kolmogorov forward equations we have

(2) $p(x_1, x_2, t) = p(x_1 - \Delta, x_2 - \Delta, t - \Delta)\,(1 - h_1(x_1 - \Delta)\Delta)\,(1 - h_2(x_2 - \Delta)\Delta)$

for all $(0, 0) < (x_1, x_2) \in D_t$, where $p(x_1, x_2, t)$ is the p.d.f. of $(X_1(t), X_2(t))$. From (2) we can obtain a partial differential equation for $p(x_1, x_2, t)$:

(3) $\dfrac{\partial p}{\partial x_1} + \dfrac{\partial p}{\partial x_2} + \dfrac{\partial p}{\partial t} = -p\,(h_1(x_1) + h_2(x_2))$.

Letting $t$ go to $\infty$ in (3), we obtain a partial differential equation for $p(x_1, x_2)$, the stationary joint p.d.f. of the ages of the units,

(4) $\dfrac{\partial p}{\partial x_1} + \dfrac{\partial p}{\partial x_2} = -p\,(h_1(x_1) + h_2(x_2))$, for all $(x_1, x_2) \in D$,

where, by (1),

(5) $D = D_\infty = \{(x_1, x_2) \mid 0 \le x_1 < \infty,\ \max(0, x_1 - L_1) \le x_2 < x_1 + L_2\}$.

The general solution of (4) is

(6) $p(x_1, x_2) = \bar{F}_1(x_1)\,\bar{F}_2(x_2)\,H(x_1 - x_2)$.

Under the OFRP the process $(X_1(t), X_2(t))$ jumps back to the origin after every common replacement of the units. Thus, a probability mass is concentrated on the path $D' = \{(x_1, x_2) \mid x_1 = x_2 = x\} \subset D$. Let $q(x)$ be the probability density along this path. Then

$q(x) = q(x - \Delta)\,(1 - h_1(x - \Delta)\Delta)\,(1 - h_2(x - \Delta)\Delta)$,

which yields the differential equation

$q'(x) = -q(x)\,(h_1(x) + h_2(x))$.

The general solution of this equation is given by

(7) $q(x) = \bar{F}_1(x)\,\bar{F}_2(x)\,H_0$.
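The policy just described can be sketched as a small event-driven simulation that counts single and common replacements. This is only an illustration under stated assumptions: the function name `simulate_ofrp`, the lifetime samplers passed in as `draw1` and `draw2`, and the finite `horizon` are all hypothetical choices, not part of the paper's analysis.

```python
import random


def simulate_ofrp(draw1, draw2, L1, L2, horizon, seed=0):
    """Simulate an OFRP for two units and count replacement types.

    draw1, draw2: callables taking a random.Random and returning a lifetime.
    L1, L2: control limits on the ages of unit 1 and unit 2.
    """
    rng = random.Random(seed)
    draw = (draw1, draw2)
    limit = (L1, L2)
    installed = [0.0, 0.0]                      # installation time of each unit
    fail_at = [draw[0](rng), draw[1](rng)]      # scheduled failure times
    counts = {"single1": 0, "single2": 0, "common": 0, "failures": 0}
    while True:
        i = 0 if fail_at[0] <= fail_at[1] else 1    # unit that fails next
        t = fail_at[i]
        if t > horizon:
            break
        j = 1 - i                                   # the unfailed unit
        counts["failures"] += 1
        age_other = t - installed[j]
        installed[i] = t                            # failed unit is always replaced
        fail_at[i] = t + draw[i](rng)
        if age_other > limit[j]:
            # unfailed unit is older than its control limit: common replacement
            installed[j] = t
            fail_at[j] = t + draw[j](rng)
            counts["common"] += 1
        else:
            counts["single1" if i == 0 else "single2"] += 1
    return counts
```

Dividing the counts by `horizon` gives rough estimates of the long-run replacement rates; with both control limits set to infinity the sketch reduces to independent failure replacement, and with both set to zero every failure triggers a common replacement.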
In order to find the function $H(\cdot)$ in (6) and the constant $H_0$ in (7), we proceed as follows: the probability densities along the $x_1$ and $x_2$ axes, which we denote by $p(x_1, 0)$ and $p(0, x_2)$, respectively, are given by

(8) $p(x_1, 0) = \int_0^{x_1 + L_2} p(x_1, x_2)\,h_2(x_2)\,dx_2$, $0 < x_1 < L_1$,
$p(0, x_2) = \int_0^{x_2 + L_1} p(x_1, x_2)\,h_1(x_1)\,dx_1$, $0 < x_2 < L_2$,

where the integrals include the mass concentrated on $D'$. Substituting (6) and (7) into both sides of (8) yields two equations for $H(x)$ and $H_0$,

(9) $H(x) = \int_{-L_2}^{x} f_2(x - u)\,H(u)\,du + f_2(x)\,H_0$, $0 < x < L_1$,
$H(-x) = \int_{-L_1}^{x} f_1(x - u)\,H(-u)\,du + f_1(x)\,H_0$, $0 < x < L_2$.

Equations (9) determine $H(x)$ and $H_0$ up to a constant coefficient, which is then obtained from the normalization equation

(10) $1 = \iint_{D - D'} p(x_1, x_2)\,dx_1\,dx_2 + \int_{D'} q(x)\,dx = \int_0^\infty \int_{\max(0,\,x_1 - L_1)}^{x_1 + L_2} \bar{F}_1(x_1)\,\bar{F}_2(x_2)\,H(x_1 - x_2)\,dx_2\,dx_1 + H_0 \int_0^\infty \bar{F}_1(x)\,\bar{F}_2(x)\,dx$.

Once we have obtained $H(x)$ and $H_0$, and then $p(x_1, x_2)$ and $q(x)$ by (6) and (7), we can compute various long-run operating characteristics of our policy, such as the expected number (per unit time in the long run) of single replacements of unit 1, single replacements of unit 2, and common replacements of the units, denoted by $N_1$, $N_2$, and $N_{12}$, respectively. These quantities are given by

(11) $N_1 = \int_{0^+}^{L_2} p(0, x_2)\,dx_2$, $N_2 = \int_{0^+}^{L_1} p(x_1, 0)\,dx_1$, $N_{12} = q(0) = H_0$,

as can easily be verified.

2.2. The Solution for General Erlang Lifetime Distributions

To simplify the presentation we shall start with identical distributions and then generalize the results for nonidentical ones. The c.d.f. and p.d.f. of a general $m$-stage Erlang distribution, with scale parameter $\lambda > 0$, are, respectively,

(12) $F(x) = 1 - e^{-\lambda x} \sum_{j=0}^{m-1} \dfrac{(\lambda x)^j}{j!}$, $f(x) = \lambda e^{-\lambda x}\,\dfrac{(\lambda x)^{m-1}}{(m-1)!}$, $x \ge 0$.

When the units are identical, i.e., both possess the same lifetime distribution, we have

(13) $H(x) = H(-x)$, for all $x \ge 0$,

and hence it is sufficient to consider positive $x$ only. In this special case, (9) becomes

(14) $e^{\lambda x} H_m(x) = \int_{-L}^{x} \lambda^m e^{\lambda u}\,\dfrac{(x - u)^{m-1}}{(m-1)!}\,H_m(u)\,du + \dfrac{\lambda^m x^{m-1}}{(m-1)!}\,H_{0m}$, $0 < x < L$,

where $L_1 = L_2 = L$ is the common control limit of the two units.
The subscript $m$ has been added to $H(x)$ and $H_0$ to indicate the number of stages of the Erlang distribution under discussion. Differentiating $m$ times both the right-hand side and the left-hand side of (14) yields

$e^{\lambda x} \sum_{i=0}^{m} \binom{m}{i} \lambda^{m-i} H_m^{(i)}(x) = \lambda^m e^{\lambda x} H_m(x)$

or equivalently,

(15) $\sum_{i=1}^{m} \binom{m}{i} H_m^{(i)}(x)\,\lambda^{m-i} = 0$,

where $H_m^{(i)}(x) = d^i H_m(x)/dx^i$, $i \ge 1$, and $H_m^{(0)}(x) \equiv H_m(x)$.

The general solution of the differential equation of order $m$ (15) is given by

(16) $H_m(x) = \sum_{j=1}^{m} c_j\,e^{-\lambda(1 - \omega_j)x}$,

where $\omega_1, \ldots, \omega_m$ are the $m$th roots of unity. These roots can be represented as a geometric sequence

(17) $\omega_j = e^{2\pi i j/m}$, $j = 1, \ldots, m$,

where $i = \sqrt{-1}$.

We still need the coefficients $c_j$ ($j = 1, \ldots, m$) and the constant $H_{0m}$. Inserting (16) into both sides of (14) (while recalling (13)), we can obtain, after a series of mathematical operations, a polynomial in $x$ of order $m - 1$ which is identically zero. Hence, all its coefficients must be zero. Calculating these coefficients and equating them to zero yields, after a series of mathematical manipulations,

(18) $\sum_{j=1}^{m} a_{rj}\,c_j = 0$, $r = 0, 1, \ldots, m - 2$,
$\sum_{j=1}^{m} a_{(m-1)j}\,c_j = \lambda H_{0m}$,

where

(19) $a_{rj} = \omega_j^r - \dfrac{1}{(2 - \omega_j)^{m-r}} + e^{-\lambda L(2 - \omega_j)} \sum_{t=0}^{m-1-r} \dfrac{[\lambda(2 - \omega_j)L]^t}{t!\,(2 - \omega_j)^{m-r}}$, $r = 0, \ldots, m - 1$; $j = 1, \ldots, m$.

(18) is a set of $m$ linear equations in the $m + 1$ variables $c_1, \ldots, c_m$ and $H_{0m}$, with known coefficients. One more independent linear equation in these variables is obtained by inserting (16) into the normalization equation (10). A straightforward solution of this set of linear equations yields $H_m(x)$ and $H_{0m}$ and thus completes the solution for units with identical Erlang lifetime distributions.

The analysis for nonidentical Erlang distributions follows similar lines. The final results for different scale parameters $\lambda_1$, for unit 1, and $\lambda_2$, for unit 2, are

(20) $H_m(x) =$ [illegible in the source]

where $c_{ij}$ ($i = 1, 2$; $j = 1, \ldots, m$) satisfy the linear equations

(21) $\sum_{j=1}^{m} c_{1j}\,\omega_j^r + \sum_{j=1}^{m} c_{2j}\,a_{2rj} = 0$, $r = 0, 1, \ldots, m - 2$,
[second equation of (21) illegible in the source]

plus an analogous set of equations with the indices 1 and 2 interchanged.
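The constants $a_{irj}$ are given by

(22) [illegible in the source], $r = 0, \ldots, m - 1$; $j = 1, \ldots, m$; $i = 1, 2$,

where

(23) $\lambda_{1j} = \lambda_1 + \lambda_2 - \lambda_2\omega_j$; $\lambda_{2j} = \lambda_1 + \lambda_2 - \lambda_1\omega_j$.

One more independent linear equation in $c_{ij}$ and $H_{0m}$ is, again, the normalization equation.

2.3. Some Special Cases

We shall now illustrate the above general solution for some particular Erlang distributions.

EXAMPLE 1. Nonidentical exponential distributions: An exponential distribution is a one-stage Erlang distribution with c.d.f. and p.d.f.

(24) $F_i(x) = 1 - e^{-\lambda_i x}$, $f_i(x) = \lambda_i e^{-\lambda_i x}$, $x \ge 0$; $i = 1, 2$.

Hence, by (20),

(25) $H_1(x) = c_{11}$ for $0 < x < L_1$; $H_1(x) = c_{21}$ for $-L_2 < x < 0$.

From (23) we have

(26) $\lambda_{11} = \lambda_1$, $\lambda_{21} = \lambda_2$,

since $\omega_1 = 1$. Substituting (26) into (22) yields

$a_{i01} = e^{-\lambda_i L_i} - 1$, $i = 1, 2$,

and hence the set of equations (21) becomes here

(27) $c_{21}(e^{-\lambda_2 L_2} - 1) + c_{11} = \lambda_2 H_{01}$,
$c_{11}(e^{-\lambda_1 L_1} - 1) + c_{21} = \lambda_1 H_{01}$.

Rearranging the normalization equation yields one more linear equation in $c_{11}$, $c_{21}$, and $H_{01}$:

(28) $1 = c_{11} \int_0^\infty \int_{\max(x_1 - L_1,\,0)}^{x_1} e^{-\lambda_1 x_1 - \lambda_2 x_2}\,dx_2\,dx_1 + c_{21} \int_0^\infty \int_{x_1}^{x_1 + L_2} e^{-\lambda_1 x_1 - \lambda_2 x_2}\,dx_2\,dx_1 + H_{01} \int_0^\infty e^{-(\lambda_1 + \lambda_2)x}\,dx$.

Solving (27) and (28) for $c_{11}$, $c_{21}$ (and $H_{01}$), we obtain $H_1(x)$. Inserting $H_1(x)$, $H_{01}$, and (24) into (6) and (7), we find the required stationary joint density of the ages of the units: for $(x_1, x_2) \in D$,

(29) $p(x_1, x_2) = e^{-\lambda_1 x_1 - \lambda_2 x_2}\,\dfrac{\lambda_1\lambda_2(\lambda_1 + \lambda_2)}{\lambda_1 + \lambda_2(1 - e^{-\lambda_1 L_1})}$, $x_1 > x_2$,
$p(x_1, x_2) = e^{-\lambda_1 x_1 - \lambda_2 x_2}\,\dfrac{\lambda_1\lambda_2(\lambda_1 + \lambda_2)}{\lambda_2 + \lambda_1(1 - e^{-\lambda_2 L_2})}$, $x_1 < x_2$,

(30) $q(x) = e^{-(\lambda_1 + \lambda_2)x}\,\dfrac{\lambda_1\lambda_2(\lambda_1 + \lambda_2)\,(e^{-\lambda_1 L_1} + e^{-\lambda_2 L_2} - e^{-\lambda_1 L_1 - \lambda_2 L_2})}{[\lambda_2 + \lambda_1(1 - e^{-\lambda_2 L_2})]\,[\lambda_1 + \lambda_2(1 - e^{-\lambda_1 L_1})]}$, $x \ge 0$.

We can now use (11) to compute some long-run operating characteristics of the policy:

(31) $N_1 = \dfrac{\lambda_1(\lambda_1 + \lambda_2)(1 - e^{-\lambda_2 L_2})}{\lambda_2 + \lambda_1(1 - e^{-\lambda_2 L_2})}$, $N_2 = \dfrac{\lambda_2(\lambda_1 + \lambda_2)(1 - e^{-\lambda_1 L_1})}{\lambda_1 + \lambda_2(1 - e^{-\lambda_1 L_1})}$,
$N_{12} = \dfrac{\lambda_1\lambda_2(\lambda_1 + \lambda_2)\,(e^{-\lambda_1 L_1} + e^{-\lambda_2 L_2} - e^{-\lambda_1 L_1 - \lambda_2 L_2})}{[\lambda_1 + \lambda_2(1 - e^{-\lambda_1 L_1})]\,[\lambda_2 + \lambda_1(1 - e^{-\lambda_2 L_2})]}$.

EXAMPLE 2. Identical two-stage Erlang distributions: On setting $m = 2$ in (12) we get a two-stage Erlang distribution,

$F(x) = 1 - e^{-\lambda x}(1 + \lambda x)$, $f(x) = \lambda^2 x\,e^{-\lambda x}$, $x \ge 0$.

Hence, by (16) and (13),

(32) $H_2(x) = c_1 e^{-2\lambda |x|} + c_2$, $|x| < L$,

since the square roots of unity are

(33) $\omega_1 = -1$, $\omega_2 = 1$.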
Substituting (33) into (19) yields

(34) $a_{01} = \tfrac{1}{9}\,(8 + e^{-3\lambda L}(1 + 3\lambda L))$, $a_{02} = e^{-\lambda L}(1 + \lambda L)$, $a_{11} = \tfrac{1}{3}\,(e^{-3\lambda L} - 4)$, $a_{12} = e^{-\lambda L}$,

so that equations (18) become here

(35) $c_1[8 + e^{-3\lambda L}(1 + 3\lambda L)]/9 + c_2 e^{-\lambda L}(1 + \lambda L) = 0$,
$c_1(e^{-3\lambda L} - 4)/3 + c_2 e^{-\lambda L} = \lambda H_{02}$.

From (35) we obtain

(36) $c_2 = -c_1[8 + e^{-3\lambda L}(1 + 3\lambda L)]/[9e^{-\lambda L}(1 + \lambda L)]$,
$H_{02} = 2c_1(e^{-3\lambda L} - 6\lambda L - 10)/[9\lambda(1 + \lambda L)]$,

which solves $H_2(x)$ and $H_{02}$ (and hence $p(x_1, x_2)$ and $q(x)$) up to a constant coefficient. This coefficient can be obtained readily, though somewhat tediously, by using the normalization equation (see the preceding example).

3. OPPORTUNISTIC AGE REPLACEMENT POLICY (OARP)

Under an OFRP we carry out planned replacements at failure epochs only. In some cases this limited intervention is not adequate and a more extensive replacement procedure is desirable. The Opportunistic Age Replacement Policy which we shall study extends the OFRP by also allowing planned replacements of units which reach a predetermined age $S$, which we call the critical age. Any such age replacement of a unit necessitates the stopping of the machine and thus creates, as at failure epochs, a planned replacement opportunity for the second unit. The replacement of the second unit will actually be carried out if its age exceeds the control limit $L$.

3.1. The General Solution

To simplify the presentation we shall assume here that the units are identical. Generalization of the solution for nonidentical units can be made according to the analysis in the previous section.

First, we note that the problem is interesting only if $S > L$. Otherwise, the OARP degenerates to independent individual age replacement policies for the two units, with no common replacements whatsoever. In this case the stationary joint p.d.f. of the ages of the units is simply given by

$p(x_1, x_2) = p(x_1)\,p(x_2)$, $0 \le x_i \le S$; $i = 1, 2$,

where $p(\cdot)$ is the stationary p.d.f. of the age of a single unit undergoing an age replacement policy (ARP).
It can be shown that

$p(x) = \bar{F}(x) \Big/ \int_0^S \bar{F}(u)\,du$, $0 \le x < S$,

so that

(37) $p(x_1, x_2) = \dfrac{\bar{F}(x_1)\,\bar{F}(x_2)}{\left[\int_0^S \bar{F}(u)\,du\right]^2}$, $0 \le x_i < S$; $i = 1, 2$.

Clearly we shall have no probability mass, in this case, on the path $D' = \{(x_1, x_2) \mid x_1 = x_2 = x\}$, since no common replacements of the units are made. So we shall assume in the sequel that $S > L$.

Repeating the arguments of section 2.1, we find that the stationary joint p.d.f. of the ages of the units, under an OARP, is given by

(38) $p(x_1, x_2) = \bar{F}(x_1)\,\bar{F}(x_2)\,H(x_1 - x_2)$, for $(x_1, x_2) \in D_s$,

(39) $q(x) = \bar{F}^2(x)\,H_0$, $x \ge 0$,

where the feasible region of $(X_1, X_2)$ is

(40) $D_s = \{(x_1, x_2) \mid 0 \le x_1 < S,\ \max(0, x_1 - L) \le x_2 < \min(x_1 + L, S)\}$.

The probability density along the $x_1$ axis is

$p(x_1, 0) = \int_0^{\min(x_1 + L,\,S^-)} p(x_1, x_2)\,h(x_2)\,dx_2 + p(x_1, S)$, $0 < x_1 < L$.

Substituting (38) and (39) into this equation yields an integral equation for $H(x)$,

(41) $H(x) = \int_{\max(-L,\,(x - S)^+)}^{x} f(x - u)\,H(u)\,du + f(x)\,H_0 + \bar{F}(S)\,H(x - S)$, $0 < x < L$.

Because the units are identical we have

(42) $H(x) = H(-x)$, $0 < x < L$.

Some further analysis of (41) shows that the function $H(x)$ and the constant $H_0$ assume different forms in different regions of the plane $(S, L)$ and the interval $(0 < x < L)$. More precisely, it can be verified that

$H(x) = H_3(x)$, $0 < x < L$; $S \ge 2L$,
$H(x) = H_1(x)$, $0 < x \le S - L$; $2L > S > L$,
$H(x) = H_2(x)$, $S - L < x < L$; $2L > S > L$,

and that the constant is $\bar{H}_0$ for $S \ge 2L$ and $H_0$ for $2L > S > L$, where $H_1(x)$, $H_2(x)$, and $H_3(x)$ are symmetric around zero and satisfy the integral equations

(43) $H_1(x) = \int_{-L}^{L-S} f(x - u)\,H_2(u)\,du + \int_{L-S}^{x} f(x - u)\,H_1(u)\,du + f(x)\,H_0$, $0 < x \le S - L$,

(44) $H_2(x) = \int_{x-S}^{L-S} f(x - u)\,H_2(u)\,du + \int_{L-S}^{S-L} f(x - u)\,H_1(u)\,du + \int_{S-L}^{x} f(x - u)\,H_2(u)\,du + f(x)\,H_0 + \bar{F}(S)\,H_2(x - S)$, $S - L < x < L$,

(45) $H_3(x) = \int_{-L}^{x} f(x - u)\,H_3(u)\,du + f(x)\,\bar{H}_0$.

One more independent equation for $H(x)$ and $H_0$ is obtained from the normalization equation

(46) $1 = \iint_{(D - D') \cap D_s} p(x_1, x_2)\,dx_1\,dx_2 + \int_{D' \cap D_s} q(x)\,dx$.

Having solved this set of equations for $H(x)$ and $H_0$, we immediately obtain $p(x_1, x_2)$ and $q(x)$ from (38) and (39), respectively. We can then proceed to compute various long-run operating characteristics of the OARP.
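As a rough check on the OARP as described in this section, the policy can also be simulated directly for identical units. The sketch below is an illustration only: the function name `simulate_oarp`, the sampler `draw_life`, and the finite `horizon` are assumptions introduced here, and the code does not reproduce the analytical solution.

```python
import random


def simulate_oarp(draw_life, L, S, horizon, seed=0):
    """Simulate an OARP for two identical units.

    draw_life: callable taking a random.Random and returning a lifetime.
    L: control limit on the age of the second unit; S: critical age.
    """
    rng = random.Random(seed)
    installed = [0.0, 0.0]
    life = [draw_life(rng), draw_life(rng)]
    counts = {"single_failure": 0, "common_failure": 0,
              "single_planned": 0, "common_planned": 0}
    while True:
        # each unit stops the machine at its failure or at age S, whichever is first
        stop = [installed[k] + min(life[k], S) for k in (0, 1)]
        i = 0 if stop[0] <= stop[1] else 1
        t = stop[i]
        if t > horizon:
            break
        j = 1 - i
        kind = "failure" if life[i] < S else "planned"
        age_other = t - installed[j]
        installed[i] = t
        life[i] = draw_life(rng)
        if age_other > L:
            # the second unit exceeds its control limit: replace it as well
            installed[j] = t
            life[j] = draw_life(rng)
            counts["common_" + kind] += 1
        else:
            counts["single_" + kind] += 1
    return counts
```

With `S` set to infinity the sketch performs no planned replacements and behaves like an OFRP. For exponential lifetimes with rate 1, the total number of failure replacements per unit time should come out close to 2, since replacements do not alter the failure processes of memoryless units.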
Under the OARP we have, besides single and common failure replacements, also single and common planned replacements (when either of the units reaches its critical age). As in the OFRP, we denote by $N_i$ ($i = 1, 2$) and $N_{12}$ the expected number per unit time in the long run of single failure replacements of unit $i$ and common failure replacements, respectively. The expected number of single planned replacements of unit $i$ ($i = 1, 2$) and common planned replacements of the units will be denoted by $\bar{N}_i$ and $\bar{N}_{12}$, respectively. The following formulae are easily verified:

(47) $\bar{N}_i = \int_{S-L}^{L} p(x, S)\,dx$ for $L < S < 2L$; $\bar{N}_i = 0$ for $S \ge 2L$; $i = 1, 2$,

(48) $\bar{N}_i + N_i = \int_{0^+}^{L} p(x, 0)\,dx$,

(49) $\bar{N}_{12} = 2\int_{\max(L,\,S-L)}^{S} p(x, S)\,dx + q(S)$; $N_{12} + \bar{N}_{12} = q(0) = H_0$.

3.2. Example: Two Identical Exponential Distributions

We shall now use the above procedure for a detailed solution for a machine consisting of two identical units, each possessing an exponential lifetime distribution

$F(x) = 1 - e^{-\lambda x}$, $x \ge 0$; $f(x) = \lambda e^{-\lambda x}$, $x > 0$.

To find $p(x_1, x_2)$ and $q(x)$ we have to solve the integral equations (45), (43), and (44) for $H(x)$ and $H_0$. These equations become here

(50) $H_3(x) = \int_{-L}^{x} \lambda e^{-\lambda(x - u)}\,H_3(u)\,du + \lambda e^{-\lambda x}\,\bar{H}_0$, $0 < x < L$,

(51) $H_1(x) = e^{-\lambda x}\left[\int_{-L}^{L-S} \lambda e^{\lambda u} H_2(u)\,du + \int_{L-S}^{x} \lambda e^{\lambda u} H_1(u)\,du\right] + \lambda e^{-\lambda x} H_0$, $0 < x \le S - L$,

(52) $H_2(x) = e^{-\lambda x}\left[\int_{x-S}^{L-S} \lambda e^{\lambda u} H_2(u)\,du + \int_{L-S}^{S-L} \lambda e^{\lambda u} H_1(u)\,du + \int_{S-L}^{x} \lambda e^{\lambda u} H_2(u)\,du\right] + e^{-\lambda S} H_2(x - S) + \lambda e^{-\lambda x} H_0$, $S - L < x < L$.

Solving (50), we find that $H_3(x)$ is a constant $H_3$, satisfying

$H_3 = \lambda e^{\lambda L}\,\bar{H}_0$.

Using the normalization equation, in the region $S \ge 2L$, we finally obtain

(53) $H_3 = 2\lambda^2 e^{\lambda L} / [-1 + 2e^{\lambda L} - e^{-2\lambda S}(1 + 2e^{2\lambda L} - 2e^{\lambda L})]$,

(54) $\bar{H}_0 = 2\lambda / [-1 + 2e^{\lambda L} - e^{-2\lambda S}(1 + 2e^{2\lambda L} - 2e^{\lambda L})]$.

Proceeding to the region $S < 2L$, we first turn to (51). It can be verified that the solution of this equation is a constant $H_1$, satisfying

(55) $H_1 = e^{\lambda(S - L)}\left[\int_{-L}^{L-S} \lambda e^{\lambda u} H_2(u)\,du + \lambda H_0\right]$.

Inserting $H_1(x) = H_1$ into (52) and differentiating the resulting equation we obtain, after some manipulations, the differential equation

(56) $H_2'(x) = e^{-\lambda S} H_2'(x - S)$, $S - L < x < L$.
Noting that S —L<^x<^L implies — Z<x —S<iL —S, and recalling that H2{x) is symmetric around zero, we can deduce from (56) that ^2(2^) is a constant too: (57) H2{x)=Ih. (Otherwise, substituting y=x—Sm (56) will lead to a contradiction.) Inserting (57) into (55) yields (58) {Hi-H2)e^'^-''^+H2e-^''=\H,. Substituting Hi{x)=Hi and H2{x)=H2 into (52) yields (59) e-^'UHi-Hr) (e^<^-^>-e^(^-^')+x5j=0, for all S-L<x<L. Hence, (60) {H2-H1) (e^'^-^'-e^(^-^>)+X^o=0. Combining (58) and (60) yields the relations H2=.ff,/(l-e-^^)=X#„e-^V(l-e''^''-^') which solve ^2(2;), Hz{x), and Ho up to a constant coefficient. Using the normalization equation, in the region L<CS<C2L, we finally obtain (61) H,=\yV, i?2=XV(l-e'^)F, Ho=\e-^''il-e'^''-^')/il-e-^^)V 26 M. BERG where The stationary joint p.d.f. of the ages of the units is therefore (by (38) and (39)) (xi, x^tB, (62) (63) 3(3:) = e-x^-i+-2)H,, S-L<\x,-x,\<L; L<S<2L e-'^'H„ x^O; L<S<2L where ^3, Hg, Hi, H2, and Hg are given in (53), (54), and (61). The expected number per unit time in the long run of the various types of replacements which we have listed above can now be obtained by inserting (62) and (63) into equations (47)-(49) : (64) A^=A^2-= (65) Ni=N2= (66) Ni,= 0, S^2L 2X(e^^-l)/t7, S^2L X(l-e-^'^)(l-e-^^)/F, L<S<2L 2Xe-'^^l-\-2e'^''-2e^'')/U Xe-^^[2(e-^-^-e-^^)(l-e-^^) + e-'<^+^'(l-e''<-^-^>)]/(l-c-^^)y, L<S<2L (67) Ni,= where 2X[l-e-2^^(l+2e='^^-2e^'^)]/C/, S^2L 2X(F-l+e-^^)/F, L<S<2L [7=_l_|.2gXL_g-2xs(i_|_2g2xi_2eXL)_ It is constructive to note that A^i+A^2+j'Vi2=2X in both regions, since planned replacements have no effect on the Poisson failure processes of the two units (whose combined rate is 2X per unit time) . Before closing this section it is interesting to examine some boundary cases. CASE 1. 
$S = L$: Inserting $S = L$ in (62) and (63), while noting that the relevant region in this case is $L < S < 2L$, we obtain

(68) $p(x_1, x_2) = e^{-\lambda(x_1 + x_2)}\,\dfrac{\lambda^2}{(1 - e^{-\lambda S})^2}$, $0 \le x_i < S$; $i = 1, 2$,

and $q(x) = 0$, since $D_s$ here is the region

$D_s = \{(x_1, x_2) \mid 0 \le x_i \le S,\ i = 1, 2\}$

and obviously $D_s \subset \{(x_1, x_2) \mid |x_1 - x_2| < S\}$. Equation (68) and $q(x) = 0$ confirm, for the special case $\bar{F}(x) = e^{-\lambda x}$, our earlier statement that the OARP degenerates, when $S \le L$ (and $S = L$ in particular), to independent individual age replacement policies for the two units.

CASE 2. $S \to \infty$: Letting $S$ go to $\infty$ in (62) and (63), while noting that the relevant region in this case is $S \ge 2L$, we see that $p(x_1, x_2)$ and $q(x)$ coincide with their counterparts in the OFRP (compare with (29) and (30) when $\lambda_1 = \lambda_2 = \lambda$). This result agrees with our previous remark that the OFRP is the special case of the OARP when $S \to \infty$.

CASE 3. $S = 2L$: This is a check of continuity. Inserting $S = 2L$ in (62) and (63), we get identical expressions for $p(x_1, x_2)$ and $q(x)$ along the common boundary of the regions $S \ge 2L$ and $L < S < 2L$ in the $(S, L)$ plane.

4. OPTIMAL CHOICE OF POLICY PARAMETERS

We now come to the important question of which policy parameters to use. Turning first to the OFRP, we are actually asking which control limits should be employed when facing a real-life situation. In order to tackle this question, let us first impose the following cost structure on the problem. Let

$a_i$ be the cost of a single (failure) replacement of unit $i$,
$b$ be the cost of a common replacement of the units (following a failure of one of them).

Usually, $\max(a_1, a_2) < b < a_1 + a_2$.

Besides replacement costs we also assume running costs of the units, which increase with their age due to maintenance costs, decrease in the output, etc. More precisely, let $r_i(x)$ be the marginal running cost per unit time of unit $i$ at age $x$; i.e., $\int_0^x r_i(u)\,du$ is the total running cost of unit $i$ up to age $x$.
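To see how this cost structure combines with the replacement rates of the OFRP, the following sketch evaluates the closed-form rates of Example 1 (nonidentical exponential units), as read from (31), and the resulting replacement cost per unit time $a_1 N_1 + a_2 N_2 + b N_{12}$. The function names are illustrative, the running-cost term is deliberately left out, and the reading of (31) used here should be treated as an assumption.

```python
import math


def ofrp_exponential_rates(lam1, lam2, L1, L2):
    """Long-run rates (N1, N2, N12) for exponential units under an OFRP."""
    a = 1.0 - math.exp(-lam1 * L1)   # chance that unit 1 fails before age L1
    b = 1.0 - math.exp(-lam2 * L2)   # chance that unit 2 fails before age L2
    total = lam1 + lam2
    n1 = lam1 * total * b / (lam2 + lam1 * b)
    n2 = lam2 * total * a / (lam1 + lam2 * a)
    n12 = (lam1 * lam2 * total * (1.0 - a * b)
           / ((lam1 + lam2 * a) * (lam2 + lam1 * b)))
    return n1, n2, n12


def replacement_cost_rate(lam1, lam2, L1, L2, a1, a2, b):
    """Replacement part of the long-run cost rate (running costs excluded)."""
    n1, n2, n12 = ofrp_exponential_rates(lam1, lam2, L1, L2)
    return a1 * n1 + a2 * n2 + b * n12
```

A useful sanity check is that the three rates always add up to $\lambda_1 + \lambda_2$, because every failure leads to exactly one single or common replacement. With infinite control limits the common rate vanishes, and with zero control limits every replacement is a common one.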
A commonly used objective function, when dealing with costs, is the expected cost per unit time in the long run. For an OFRP with control limits $L_1$ and $L_2$ this function is given by

(69) $C(L_1, L_2) = \sum_{i=1}^{2} a_i N_i + b N_{12} + \bar{r}$,

where

(70) $\bar{r} = \iint_{D - D'} p(x_1, x_2)\,(r_1(x_1) + r_2(x_2))\,dx_1\,dx_2 + \int_0^\infty q(x)\,(r_1(x) + r_2(x))\,dx$

is the total expected running cost per unit time in the long run (another long-run operating characteristic of the OFRP). The control limits which we shall want to use are, naturally, those for which $C(L_1, L_2)$ attains its minimum.

As an illustration, let us consider again the situation of Example 1 in section 2.3. In that example we had two nonidentical units, both possessing an exponential lifetime distribution. Imposing the above cost structure on this model, we first let $r_i(x)$ be a linear function of $x$:

(71) $r_i(x) = \rho_i x$, $x \ge 0$; $\rho_i > 0$; $i = 1, 2$.

Computing $\bar{r}$ from (70) and inserting it, together with $N_i$ ($i = 1, 2$) and $N_{12}$ (from (31)), into (69) yields $C(L_1, L_2)$. Differentiating $C(L_1, L_2)$ with respect to $L_1$ and $L_2$ and equating both derivatives to zero, we obtain two equations for the optimal control limits:

(72), (73) [illegible in the source]

It is easy to verify that the R.H.S. of (72) (resp. (73)) increases with $L_1$ (resp. $L_2$) from 0 to $\infty$. Hence there exists a unique finite solution $L_1^*$ (resp. $L_2^*$) to this equation. It can also be verified that $L_1^*$ (resp. $L_2^*$) decreases with $\rho_1$ (resp. $\rho_2$) and increases with $b - a_2$ (resp. $b - a_1$), as one would have intuitively guessed. The minimum of $C(L_1, L_2)$ can be shown to be

(74) $C(L_1^*, L_2^*) =$ [illegible in the source]

It is instructive to examine how the optimal control limit behaves as we vary the running cost function $r(x)$. Repeating the above procedure, when $\lambda_1 = \lambda_2 = \lambda$, we find that the equation for the optimal control limit $L$, when a general $r(x)$ is considered, is

(75) $b - a = 2e^{\lambda L}(e^{\lambda L} - 1) \int_L^\infty e^{-2\lambda x}\,r(x)\,dx - \int_0^L e^{-\lambda x}\,r(x)\,dx$.

Assuming that

$\int_0^\infty e^{-2\lambda x}\,r(x)\,dx < \infty$,

it can be verified that the R.H.S. of (75) increases with $L$ from 0 to $\hat{r}$, where

$\hat{r} = \dfrac{1}{\lambda} \lim_{x \to \infty} r(x) - \int_0^\infty e^{-\lambda x}\,r(x)\,dx$.

Hence, a unique finite solution $L^*$ will exist if

(76) $\hat{r} > b - a$.

Otherwise, $L^* = \infty$ and the optimal OFRP degenerates to independent failure replacement policies for the two units. A sufficient condition for (76) is that $r(x)$ increases with $x$ indefinitely (the linear running cost function considered above possesses this property).

Further analysis of (75) reveals that its R.H.S. increases, as a functional of $r(x)$, with $r'(x)$. Hence,

(77) $r_I'(x) > r_{II}'(x) \Rightarrow L_I^* < L_{II}^*$,

where $L_I^*$ (resp. $L_{II}^*$) is the optimal control limit associated with $r_I$ (resp. $r_{II}$).

When the assumption

$\int_0^\infty e^{-2\lambda x}\,r(x)\,dx < \infty$

does not hold (for instance, with $r(x) = e^{2\lambda x}$, $x > 0$), then $L^* = 0$; i.e., we replace both units at every failure epoch. But even this intensive planned replacement strategy does not reduce the high running costs incurred by aged units, and the expected cost per unit time in the long run will still be infinite. Thus, a more extensive replacement procedure is necessary in this case. This led us to consider the OARP, under which we could prevent the units from becoming older than a predetermined critical age and thus check the increase in the running costs.

The expected cost per unit time in the long run of an OARP with control limits $L_i$ and critical ages $S_i$ ($i = 1, 2$) is given by

(78) $C(L_1, L_2, S_1, S_2) = \sum_{i=1}^{2} (N_i a_i + \bar{N}_i \bar{a}_i) + \bar{N}_{12}\,\bar{b} + N_{12}\,b + \bar{r}$,

where

$\bar{a}_i$ is the cost of a single age replacement of unit $i$,
$\bar{b}$ is the cost of a common planned replacement of the two units.

Usually, $\bar{a}_i < a_i$, $\bar{b} < b$.

We can now compute $C(L_1, L_2, S_1, S_2)$ using the formulae developed in the previous section, and then find the optimal policy parameters by a procedure analogous to the one described for the OFRP.

ACKNOWLEDGMENT

The author wishes to thank Professor B. Epstein for his help and interest.

REFERENCES

[1] Bansard, J. P., J. L. Descamps, G. Maarek, and G.
Morihain, "Study of Stochastic Renewal Process Applied to a Trigger-Off Policy for Replacing Whole Sets of Components," in Proceedings of the 5th International Conference on Operational Research, 1969, L. Lawrence, ed., pp. 235-264 (Tavistock, London 1970).
[2] Berg, M., "Optimal Replacement Policies for Two Unit Machines with Increasing Running Costs I," Stochastic Processes and their Applications 4 80-106 (1976).
[3] Jorgenson, D. W., J. J. McCall, and R. Radner, Optimal Replacement Policy (North Holland, Amsterdam 1967).

THE SIMPLEX METHOD FOR INTEGRAL MULTICOMMODITY NETWORKS

James R. Evans
Department of Quantitative Analysis
University of Cincinnati
Cincinnati, Ohio

ABSTRACT

The simplex method is interpreted as a labeling procedure for certain classes of multicommodity flow problems in a manner similar to that for single commodity networks. As opposed to general multicommodity algorithms, no explicit matrix inversion is required; all simplex operations are performed graph-theoretically.

INTRODUCTION

The graph-theoretic structure of bases for single commodity network flow problems has been known since the early days of linear programming (see Dantzig [2], chapter 17). This structure results in considerable simplification of the simplex method by enabling one to employ efficient labeling schemes in place of matrix inversion, and hence to develop quite fast computer codes [1, 7]. Hartman and Lasdon [9] have exploited these properties in tailoring the generalized upper bounding algorithm to multicommodity network flow problems. Only a small working basis requires explicit inversion; all other simplex computations are performed graph-theoretically. Computational experience for a code specially structured for transportation problems is reported in Kennington [12].

Recently, the author has been investigating classes of multicommodity networks with unimodular constraint matrices that have equivalent formulations as single commodity flow problems [3, 4, 5, 6].
For these problems, the representation of a nonbasic column in terms of basic columns is a (+1, -1, 0)-valued vector. Hence, it appears that the simplex method can be implemented by using labeling procedures similar to those used for single commodity problems without necessitating a transformation to single commodity equivalent networks. Suitable modifications to existing labeling procedures (e.g., Srinivasan and Thompson [14]) can be incorporated and applied to these problems. In this paper we specialize the simplex method as a labeling procedure for integral multicommodity networks.

BASIS STRUCTURES IN SINGLE AND MULTICOMMODITY NETWORKS

The general (uncapacitated) network flow problem for a single commodity is formulated as

(1) $\min\ cx$
$Ax = b$
$x \ge 0$

where $x_{ij}$ is the flow on arc $(i, j)$ with unit cost $c_{ij}$, $A$ is an $M \times N$ node-arc incidence matrix of the network, $b_i > 0$ denotes a supply at node $i$, $b_i < 0$ is a demand at node $i$, and $b_i = 0$ indicates that node $i$ is a pure transshipment node. We assume without loss of generality that $\sum_i b_i = 0$. Let $B$ be a basis chosen from among the columns of $A$ after the last row of $A$ is deleted, since the rank of $A$ is $M - 1$. The major result concerning single commodity network flows is

THEOREM 1: $B$ is a basis if and only if the associated columns form a spanning tree in the network.

The theorem is illustrated in Figures 1 and 2. Such a characterization enables one to apply the simplex method to (1) efficiently. Since each nonbasic arc forms a unique cycle in the network with the arcs of the tree, its representation in terms of the basic arcs can be obtained by a simple labeling process. Pricing-out and pivoting operations are then trivial. Details are provided in Dantzig [2]; for a more complete treatment with refinements, the reader is referred to Johnson [10, 11] and Srinivasan and Thompson [14].

$A = \begin{pmatrix} 1 & 1 & 0 & 0 & 0 \\ -1 & 0 & 1 & 1 & 0 \\ 0 & -1 & -1 & 0 & 1 \\ 0 & 0 & 0 & -1 & -1 \end{pmatrix}$, with columns $x_{12}, x_{13}, x_{23}, x_{24}, x_{34}$ and rows corresponding to nodes 1 to 4.

Figure 1. A Network and Incidence Matrix.

$B = \begin{pmatrix} 1 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & -1 & 1 \end{pmatrix}$, with columns $x_{12}, x_{13}, x_{34}$.

Figure 2.
Spanning Tree and Basis Matrix.

The general multicommodity flow problem can be formulated as

(2.1) $\min \sum_{k=1}^{r} c^k x^k$
(2.2) $A_k x^k = b^k$ for all $k$
(2.3) $\sum_k x^k + s = u$
(2.4) $x^k, s \ge 0$ for all $k$.

Here, $x_{ij}^k$ is the flow of commodity $k$ on arc $(i, j)$ with unit cost $c_{ij}^k$, $A_k$ is the node-arc incidence matrix for commodity $k$, $b^k$ is the supply-demand vector, $u$ is a vector of upper bounds on the arcs, and $s$ is a vector of slack variables. Through the application of generalized upper bounding [9, 13], it can be shown that any basis to this system can be expressed as a block matrix

[Block basis matrix: $B_1, B_2, \ldots, B_r$ on the diagonal, with additional blocks $E$ and $S_1$ and column sets $J$ and $U$; layout not recoverable from the source]

In this representation, $B_1, B_2, \ldots, B_r$ are bases (trees) for commodities $1, 2, \ldots, r$. The set $J$ consists of additional flow variables, and $U$ represents the set of slack variables that are basic. If $x_{ij}^k \in J$, this arc forms a unique cycle with respect to $B_k$. The result is that a basis to a multicommodity flow problem consists of a set of spanning trees and (possibly) a set of cycles in the network. A general solution procedure is given in Hartman and Lasdon [9]. By exploiting this basis structure, one need only maintain a "working basis" inverse $S_1^{-1}$. All other simplex computations are performed graph-theoretically.

INTEGRAL MULTICOMMODITY NETWORKS

In general, the matrix $S_1$ is not unimodular. Therefore, it is quite possible, and often the case, that a basic feasible solution to a multicommodity network flow problem will not be integer-valued. However, certain classes of problems with unimodular constraint matrices have been identified. For obvious reasons, these are called integral multicommodity networks (IMN's). Among these are the class of $m$-source, $n$-sink, $r$-commodity transportation problems in which $m = 2$ or $n = 2$ for all $r > 2$ [3], and multiproduct dynamic production planning problems [4, 6]. These classes of problems have been shown to have equivalent formulations as single commodity flow problems.
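The role of unimodularity can be illustrated on a small scale. The sketch below checks total unimodularity, a standard sufficient condition for integral basic solutions, by brute force over all square submatrices. It is applied to the node-arc incidence matrix of the four-node network with arcs (1,2), (1,3), (2,3), (2,4), (3,4) used in Figure 1. The helper names are illustrative, and the exhaustive test is practical only for very small matrices.

```python
from itertools import combinations


def det(m):
    """Integer determinant by cofactor expansion along the first row."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for c in range(n):
        minor = [row[:c] + row[c + 1:] for row in m[1:]]
        total += (-1) ** c * m[0][c] * det(minor)
    return total


def is_totally_unimodular(a):
    """True if every square submatrix of a has determinant -1, 0, or +1."""
    rows, cols = len(a), len(a[0])
    for k in range(1, min(rows, cols) + 1):
        for rs in combinations(range(rows), k):
            for cs in combinations(range(cols), k):
                sub = [[a[r][c] for c in cs] for r in rs]
                if det(sub) not in (-1, 0, 1):
                    return False
    return True


def incidence_matrix(n_nodes, arcs):
    """Node-arc incidence matrix: +1 at the tail node, -1 at the head node."""
    a = [[0] * len(arcs) for _ in range(n_nodes)]
    for col, (i, j) in enumerate(arcs):
        a[i - 1][col] = 1
        a[j - 1][col] = -1
    return a
```

The single commodity incidence matrix passes the test, whereas a matrix such as [[1, 1], [-1, 1]], whose determinant is 2, does not; general multicommodity constraint matrices can fail in the same way once the arc-capacity rows are added.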
A more general transformation defining other integral multicommodity networks is given in [5]. IMN's possess the Dantzig property [2]: the coefficients in the representation of a nonbasic variable in terms of the basic variables are $+1$, $-1$, or $0$. Since this property enables one to develop efficient labeling schemes for single commodity networks, it is plausible to extend this idea to IMN's. Although IMN's can be solved by transformation to their equivalent single commodity representations, a tailored labeling method based upon the original network structure would appear to be more effective, since the transformation is eliminated. It is anticipated that such methods would be useful in decomposition and aggregation methods which exploit integral substructures of more general multicommodity networks.

SIMPLEX PIVOTING PROCEDURES

In this section we present two methods of interpreting the simplex method on an IMN. These are analogous to the "stepping-stone" and "duality" approaches for solving ordinary transportation problems (see Hadley [8]). We assume that at some intermediate stage, a basic feasible solution is known. To apply the simplex entry and exit criteria, one must know the representation of a nonbasic tableau column in terms of the basic columns.

Let $B_1, B_2, \ldots, B_r$ be the associated spanning trees and $U$ the set of basic slack variables of a particular basis. If $s_{ij} \notin U$, then there exists some flow variable $x^k_{ij}$, for some $k$, that is basic but not a member of a tree. This must be true since every feasible basis must contain a member of each row of (2.3), but cannot contain variables associated with $\{B_k\}$. With respect to $B_k$, $x^k_{ij}$ determines a unique cycle $C^k_{ij}$. This follows from the well-known fact that if $T$ is a spanning tree of a graph and $e \notin T$, then $T \cup e$ contains a unique cycle ([9], p. 342).

Consider first a nonbasic flow variable $x^k_{pq}$. To find the coefficients in the representation of $x^k_{pq}$ in terms of basic variables, we proceed as follows.
To each basic arc $x^k_{ij}$ associate a label $\delta^k_{ij} \in \{-1, 0, 1\}$. To each slack $s_{ij}$ associate a label $\gamma_{ij} \in \{-1, 0, 1\}$. Initially, all deltas and gammas are zero. Traverse the cycle $C^k_{pq}$ in the direction of $x^k_{pq}$, assigning $\delta^k_{ij} = +1$ to each forward arc and $\delta^k_{ij} = -1$ to each reverse arc. During this process, if any arc whose slack variable is nonbasic is traversed, say $x_{p'q'}$, repeat the labeling process for $C^{k'}_{p'q'}$ (where $k'$ is the commodity such that $x^{k'}_{p'q'}$ forms the cycle with respect to $B_{k'}$). $C^{k'}_{p'q'}$ is traversed in the direction of $x^{k'}_{p'q'}$ if $\delta_{p'q'} = -1$ and in the opposite direction if $\delta_{p'q'} = +1$. This procedure is continued until all saturated arcs (those for which the slack variables are nonbasic) have $\sum_k \delta^k_{ij} = 0$. In addition, the gammas are defined as follows: if $\sum_k \delta^k_{ij} = (+1, -1, 0)$, then $\gamma_{ij} = (-1, +1, 0)$, respectively. The fact that this procedure terminates with all deltas $+1$, $-1$, or $0$ follows from the Dantzig property and the unique representation of the nonbasic variables. The reduced cost $\bar{c}^k_{ij}$ for any nonbasic arc is simply the product of $c^k_{ij}$ and $\delta^k_{ij}$, summed over all arcs and commodities.

To find the appropriate deltas for nonbasic slack variables, note that if $s_{ij}$ is to enter the basis, flow on the arc $x^k_{ij}$ which formed $C^k_{ij}$ must be reduced. The labeling procedure above is followed, except that the cycle is traversed in the direction opposite to $x^k_{ij}$. The most negative reduced cost determines the entering variable. The exiting variable is simply the arc whose flow is minimum among all arcs whose label is $-1$.

We illustrate these concepts with an example. Consider the basis for a 3-source, 2-sink, 2-commodity transportation problem illustrated in Figure 3. $x^1_{22}$ is basic and generates the cycle $C^1_{22}$. The slack variable $s_{22}$ is nonbasic; all others are basic. Consider the nonbasic variable $x^2_{31}$.
From the labeling method described above, a unit increase in $x^2_{31}$ results in the following flow changes (coefficients are deltas and gammas):

$$x^2_{31} - x^2_{32} + x^2_{22} - x^2_{21} - x^1_{22} + x^1_{21} - x^1_{11} + x^1_{12} - s_{31} + s_{32} + s_{11} - s_{12}.$$

Figure 3. Multicommodity transportation basis.

Now suppose that $s_{31}$ leaves the basis; the new basis after pivoting is shown in Figure 4. For this basis, $s_{22}$ and $s_{31}$ are nonbasic. Now suppose that $x^1_{31}$ is the next candidate to enter the basis. The flow changes are determined as

$$x^1_{31} - x^1_{32} + x^1_{12} - x^1_{11} - x^2_{31} + x^2_{32} - x^2_{22} + x^2_{21} + x^1_{22} - x^1_{21} + x^1_{11} - x^1_{12},$$

which simplifies to

$$x^1_{31} - x^1_{32} - x^2_{31} + x^2_{32} - x^2_{22} + x^2_{21} + x^1_{22} - x^1_{21}.$$

A duality approach provides an alternate interpretation of the pricing-out operation. We use the transportation problem to illustrate; the procedure is easily extended. The dual constraint complementary to $x^k_{ij}$ is

$$u^k_i - v^k_j - \sigma_{ij} \le c^k_{ij} \tag{3}$$

with $\sigma_{ij} \ge 0$. By complementary slackness, if $s_{ij} \in U$, then $\sigma_{ij} = 0$. If $x^k_{ij}$ is basic, then

$$u^k_i - v^k_j - \sigma_{ij} = c^k_{ij}. \tag{4}$$

Since the dual system has $r$ more variables than equations, one can easily verify that the dual variables can be obtained as follows. Let $u^k_1 = 0$ for all $k$. If $x^k_{1j} \in B_k$, set $v^k_j = -c^k_{1j}$. Continue labeling across basic arcs in a manner analogous to the computation of dual variables for single commodity problems. Once these are determined, for each $s_{ij} \notin U$, $\sigma_{ij}$ is computed from (4) and from whether or not the cycle traverses any other saturated arcs. For the basis in Figure 4, the dual variables are given in Figure 5. $\sigma_{22}$ appears in the expression for $\sigma_{31}$ since the cycle $C^2_{31}$ includes arc (2, 2), which is saturated. Since arc (2, 2) is traversed in the forward direction, $\sigma_{22}$ is subtracted.

Figure 4. Result after pivoting.

$$\sigma_{22} = -c^1_{22} + c^1_{21} - c^1_{11} + c^1_{12}$$
$$\sigma_{31} = -c^2_{31} + c^2_{32} - c^2_{12} - c^2_{22} + c^2_{12} + c^2_{21} - \sigma_{22} = -c^2_{31} + c^2_{32} - c^2_{22} + c^2_{21} - \sigma_{22}$$

Figure 5. Dual variable computation.
The reduced cost for a nonbasic arc is given by

$$\bar{c}^k_{ij} = c^k_{ij} - u^k_i + v^k_j + \sigma_{ij}. \tag{5}$$

For instance, $\bar{c}^1_{31}$ is

$$c^1_{31} - c^1_{32} + c^1_{12} - c^1_{11} - c^2_{31} + c^2_{32} - c^2_{22} + c^2_{21} - \sigma_{22},$$

which is equivalent to that found in the "stepping-stone" method.

To conclude this discussion, we note that a Phase I procedure can be instituted by first obtaining any set of feasible bases $B_1, B_2, \ldots, B_r$ (see [2], chapter 17), and adding an artificial variable to each infeasible capacity constraint. To these arcs, associate a label $\epsilon_{ij} = 1$. The reduced cost for each nonbasic arc is computed solely with respect to the epsilon labels, with the most negative being the entry criterion. The procedure terminates when all artificials are replaced by nonnegative slack values.

SUMMARY

We have presented a simplex labeling procedure, not unlike those developed for single commodity problems, for certain classes of integral multicommodity networks. Since no explicit matrix inversion is required, the procedure would undoubtedly be more efficient than more general multicommodity algorithms for this class of problems.

REFERENCES

[1] Bradley, G. H., G. G. Brown, and G. W. Graves, "GNET: A Primal Capacitated Network Program," Copyright, 1975.
[2] Dantzig, G. B., Linear Programming and Extensions (Princeton Press, 1963).
[3] Evans, J. R., "A Combinatorial Equivalence Between a Class of Multicommodity Flow Problems and the Capacitated Transportation Problem," Mathematical Programming 10, 401-404 (1976).
[4] Evans, J. R., "A Single Commodity Network Model for a Class of Multi-commodity Dynamic Planning Problems," Working Paper, Department of Quantitative Analysis, University of Cincinnati (1976).
[5] Evans, J. R., "A Single Commodity Transformation for Certain Multicommodity Networks," Operations Research (to appear).
[6] Evans, J. R., "Some Network Flow Models and Heuristics for Multiproduct Production and Inventory Planning," AIIE Transactions 9, 75-81 (1977).
[7] Glover, F., D. Karney, and D.
Klingman, "Computational Comparisons of Primal, Dual, and Primal-Dual Computer Codes for Minimum Cost Network Flow Problems," Networks 4, 191-212 (1974).
[8] Hadley, G., Linear Programming (Addison-Wesley, 1962).
[9] Hartman, J. K., and L. S. Lasdon, "A Generalized Upper Bounding Algorithm for Multicommodity Flow Problems," Networks 1, 333-354 (1972).
[10] Johnson, E. L., "Networks and Basic Solutions," Operations Research 14, 619-623 (1969).
[11] Johnson, E. L., "Programming in Networks and Graphs," ORC 65-1, University of California, Berkeley (1965).
[12] Kennington, J., "Solving Multicommodity Transportation Problems Using a Primal Partitioning Simplex Technique," Naval Research Logistics Quarterly 24, 309-325 (1977).
[13] Lasdon, L. S., Optimization Theory for Large Systems (MacMillan, 1970).
[14] Srinivasan, V., and G. L. Thompson, "Accelerated Algorithms for Labeling and Relabeling of Trees, with Applications to Distribution Problems," Journal of the Association for Computing Machinery 19, 712-726 (1972).

A LINEAR PROGRAMMING APPROACH TO GEOMETRIC PROGRAMS

John J. Dinkel* and Gary A. Kochenberger
Department of Management Science and Organizational Behavior
The Pennsylvania State University
University Park, Pennsylvania

William H. Elliott
Bell Laboratories
Holmdel, New Jersey

ABSTRACT

A cutting plane method, based on a geometric inequality, is described as a means of solving geometric programs. While the method is applied to the primal geometric program, it is shown to retain the geometric programming duality relationships. Several methods of generating the cutting planes are discussed and illustrated on some example problems.

1. INTRODUCTION

Geometric programming (GP) provides a convenient framework for analyzing the class of problems described by positive polynomials (posynomials) [8].
The main feature of GP lies in the duality theory, which enables one to solve the primal problem (which may be highly nonlinear) by solving an associated dual program. The dual GP has the attractive form of maximizing a concave function subject to a set of linear constraints [8]. Several algorithms have been reported to solve this dual GP and, from that solution, to compute the primal solution via the duality theory of GP [2, 6, 11]. However, it is known that whenever a primal constraint is inactive (loose), the associated dual variables are zero and the corresponding gradient terms are undefined. Frank [11] overcame this difficulty by employing a search procedure (Hooke-Jeeves) to optimize the dual GP. With respect to gradient-based procedures, Beck and Ecker [2] developed a convex-simplex method modified (through the use of subsidiary programs) to deal with loose constraints; Dinkel et al. [6] report a modified Newton-Raphson procedure. While the Frank algorithm avoids the problem of loose constraints, it encounters increased run times when the degree of difficulty is large, since it appears that the dual objective function ($\log v(\delta)$) is rather flat. The algorithms described in [2, 6] handle large programs efficiently, but certain types of loose constraints may necessitate a restart of the algorithm [2, 6].

*This author's research was supported in part by a Research Initiation Grant administered through The Pennsylvania State University.

Recently, Duffin [7] described a procedure for linearizing the primal GP based on a geometric inequality. We describe a linear programming approach, based on this development, to solving the primal GP. The method is exterior in that it approaches the optimal solution from outside the feasible region via a cutting plane technique.
Such techniques have been proposed by Avriel et al. [1], Kelley [13], and Cheney and Goldstein [3], and modified for convex programs by Hartley and Hocking [12]. These procedures are based on the construction of hyperplane approximations to the feasible region, and as such require gradient information and possibly a grid on the feasible region. The approach to be presented here uses Duffin's geometric inequality to generate cutting planes without recourse to gradient information. As in the above mentioned procedures, we are faced with a sequence of linear programs where each linear program differs from the preceding one by the addition of a cutting plane.

2. LINEARIZING GEOMETRIC PROGRAMS

The primal GP, referred to as program A, is:

$$\min\ g_0(t) \quad \text{subject to} \quad g_k(t) \le 1,\ k = 1, \ldots, p; \qquad 0 < L_j \le t_j \le U_j < \infty,\ j = 1, \ldots, m, \tag{1}$$

where each

$$g_k(t) = \sum_{i=m_k}^{n_k} u_i(t) = \sum_{i=m_k}^{n_k} c_i \prod_{j=1}^{m} t_j^{a_{ij}}$$

is a posynomial ($c_i > 0$). The above formulation differs from that given in [8] by the inclusion of upper and lower bounds on the decision variables. These bounds will be discussed later in this section as well as later in the paper.

We will find it convenient to replace the objective of (1) by $\min t_0$ and the additional constraints

$$t_0^{-1} g_0(t) \le 1, \qquad 0 < L_0 \le t_0 \le U_0 < \infty.$$

Following Duffin's development [7], for each $g_k$, $k = 0, 1, \ldots, p$, define a set of nonnegative weights $\epsilon_i$, $m_k \le i \le n_k$, such that

$$\sum_{i=m_k}^{n_k} \epsilon_i = 1.$$

Using these weights we form the monomial program $A(\epsilon)$:

$$\min\ t_0 \quad \text{subject to} \quad t_0^{-1} g_0(t; \epsilon) \le 1; \qquad g_k(t; \epsilon) \le 1,\ k = 1, \ldots, p; \qquad L_j t_j^{-1} \le 1,\quad U_j^{-1} t_j \le 1,\ j = 0, 1, \ldots, m,$$

where

$$g_k(t; \epsilon) = \prod_{i=m_k}^{n_k} \big(u_i(t)/\epsilon_i\big)^{\epsilon_i} = C'_k \prod_{j=1}^{m} t_j^{a'_{kj}}, \qquad k = 0, 1, \ldots, p,$$

$$C'_k = \prod_{i=m_k}^{n_k} (c_i/\epsilon_i)^{\epsilon_i}, \tag{2}$$

$$a'_{kj} = \sum_{i=m_k}^{n_k} \epsilon_i a_{ij}. \tag{3}$$

This construction is based on the geometric inequality ([7], Lemma 1): if $u_i \ge 0$ and $\epsilon_i \ge 0$, $i = 1, \ldots, n$, then

$$\Big(\sum_{i=1}^{n} u_i\Big)^{\lambda} \ge \prod_{i=1}^{n} (u_i/\epsilon_i)^{\epsilon_i}\, \lambda^{\lambda}, \tag{4}$$

where $\lambda = \sum_{i=1}^{n} \epsilon_i$, with $x^x = 1$ for $x = 0$.
As a consequence of this result it is clear that

$$g_k(t) \ge g_k(t; \epsilon), \tag{5}$$

where the $u_i$ of (4) are taken to be the terms of $g_k(t)$, and the $\epsilon_i$ of (4) are nonnegative and normalized for each $k = 0, 1, \ldots, p$.

Employing the logarithmic transformations $z_j = \log t_j$ and $C_k = \log C'_k$ (with the bounds transformed likewise, $L'_j = \log L_j$ and $U'_j = \log U_j$), we obtain the linear program $A_L(\epsilon)$:

$$\min\ z_0 \quad \text{subject to} \quad -z_0 + C_0 + \sum_{j=1}^{m} a'_{0j} z_j \le 0; \qquad C_k + \sum_{j=1}^{m} a'_{kj} z_j \le 0,\ k = 1, \ldots, p; \qquad L'_j - z_j \le 0,\quad -U'_j + z_j \le 0,\ j = 0, 1, \ldots, m.$$

We leave it to the interested reader to construct the geometric dual, $B(\epsilon)$, to $A(\epsilon)$, and the dual linear program, $B_L(\epsilon)$, to $A_L(\epsilon)$. The following diagram (Figure 1) provides some useful insights and the basis for the computational procedure.

Figure 1. [Diagram: $A$ is taken to $A(\epsilon)$ by the condensation (2), (3), and $A(\epsilon)$ to $A_L(\epsilon)$ by the log transformation; $A(\epsilon)$ and $B(\epsilon)$ are related by GP duality, $A_L(\epsilon)$ and $B_L(\epsilon)$ by LP duality.]

The above figure is not complete enough to guarantee the global solution to $A$ (which need not be a convex program). Recall that $A$ can be transformed to a convex program via $t_j = e^{z_j}$, which gives rise to the dual GP $B$. However, Figure 2 and the following result provide the necessary details for dealing with a convex program.

Figure 2. [Diagram: condensation and log transform relating $A_z$, $A_z(\epsilon)$, and $A(\epsilon)$; Lemma 1.]

LEMMA 1: For the primal program (1), the associated linear programs $A_{zL}(\epsilon)$ and $A_L(\epsilon)$ of Figure 2 are equivalent.

PROOF: The change of variables $t_j = e^{z_j}$, $j = 1, \ldots, m$, yields $A_z$ ([8], p. 83). With respect to (4), let $c_i \exp\big(\sum_j a_{ij} z_j\big)$ play the role of $u_i$ and define a set of nonnegative, normalized weights $\epsilon_i$. The monomial program is $A_z(\epsilon)$:

$$\min\ \exp(z_0)$$
$$\text{subject to} \quad \exp(-z_0) \prod_{i=m_0}^{n_0} (c_i/\epsilon_i)^{\epsilon_i} \exp\Big(\sum_{j=1}^{m} a_{ij} z_j \epsilon_i\Big) \le 1$$
$$\prod_{i=m_k}^{n_k} (c_i/\epsilon_i)^{\epsilon_i} \exp\Big(\sum_{j=1}^{m} a_{ij} z_j \epsilon_i\Big) \le 1, \qquad k = 1, \ldots, p$$
$$L_j \exp(-z_j) \le 1, \qquad U_j^{-1} \exp(z_j) \le 1, \qquad j = 0, 1, \ldots, m,$$

with $a_{ij}, z_j \in R$ and $c_i > 0$. Employing the logarithmic transformation described earlier, we obtain the linear program

$$\min\ z_0$$
$$\text{subject to} \quad -z_0 + C_0 + \sum_{i=m_0}^{n_0} \sum_{j=1}^{m} a_{ij} z_j \epsilon_i \le 0$$
$$C_k + \sum_{i=m_k}^{n_k} \sum_{j=1}^{m} a_{ij} z_j \epsilon_i \le 0, \qquad k = 1, \ldots, p$$
$$L'_j - z_j \le 0, \qquad -U'_j + z_j \le 0, \qquad j = 0, 1, \ldots, m,$$

which can be seen to be $A_L(\epsilon)$ using (3). This straightforward result will enable us to obtain global solutions to the primal program directly rather than through the dual, as is usually done in GP.

Remarks on Upper and Lower Bounds

REMARK 1: The preceding development differs from that of Duffin [7] by the inclusion of bounds on the variables. From a geometrical point of view, the need for these bounds can be illustrated by the following example ([8], p. 88):

subject to \ h'^ <2~'+t ^2'' ^3~' < 1 t>Q.

Forming program $A(\epsilon)$, for some choice of $\epsilon$, yields a program with two terms and three variables. The resulting linear program $A_L(\epsilon)$ then involves two constraints and four variables $(z_0, z_1, z_2, z_3)$, which is insufficient to determine a point in $E^4$ $(t_0, t_1, t_2, t_3)$. The addition of the bounds (e.g., $0.001 \le t_j \le 200$, $j = 0, 1, 2, 3$) yields an LP with 10 constraints. Thus we are able to determine a point in $E^4$ with respect to at most four linearly independent hyperplanes. Since these bounds may be artificially imposed, we want to assess their effect on the optimal solution; that is, we need to determine whether or not any of these bounds need to be relaxed in order to guarantee the attainment of the optimal solution to the original problem.

REMARK 2: With respect to the artificially imposed lower bounds, if the program is canonical then all artificial lower bounds must be inactive at the optimal solution. The authors have previously described [6] an LP method for testing the canonicality of the dual GP, which implies the positivity of all $t_j$ in a minimizing sequence. Thus, if a program is canonical and the algorithm terminates with an active artificial lower bound, that bound must be relaxed and the algorithm restarted.

REMARK 3: The presence of upper bounds (artificial or real) on each variable is necessary since, even though the dual GP is canonical, the primal GP may be inconsistent [8].
The artificial upper bounds provide a convenient indicator of primal inconsistency since the objective function grows without bound. If the program is canonical and we have a feasible point, the algorithm must terminate with all artificial bounds being inactive in order to guarantee the optimal solution to the original problem.

REMARK 4: The one exception to the above criteria is the class of geometric programs with alternate optimal solutions. As described in [5, 9], such programs may be terminated with one or more of the artificial bounds active at the solution.

Since much of the later development depends on (4) and (5), we give some basic properties of the geometric inequality as they apply to the program $A(\epsilon)$.

Geometric Inequality Properties

PROPERTY 1. The condensation (4) used to form program $A(\epsilon)$ is an exterior method.

PROOF: For some initial point $t^0$, (5) yields

$$g_k(t) \ge g_k(t; \epsilon^0), \qquad k = 0, 1, \ldots, p,$$

where $\epsilon^0_i = u_i(t^0)/g_k(t^0)$, $m_k \le i \le n_k$, for each $k$. Denoting the feasible solution sets as

$$S = \bigcap_{k=0}^{p} \{t : g_k(t) \le 1\} \quad \text{and} \quad S(\epsilon^0) = \bigcap_{k=0}^{p} \{t : g_k(t; \epsilon^0) \le 1\},$$

we have, by (5), $S \subset S(\epsilon^0)$. As an immediate consequence we have that the sequence of solution points to the linear program $A_L(\epsilon)$ is infeasible with respect to $A$ unless $A$ is a monomial program. Also as a consequence of this result we have

PROPERTY 2. If $A$ is consistent then $A(\epsilon)$ is consistent. Moreover, $M_{A(\epsilon)} \le M_A$, where $M_{A(\epsilon)}$ and $M_A$ are the respective minima of programs $A(\epsilon)$ and $A$.

These properties will be exploited in the development of the next section. The final result of this section relates the optimal solutions of $A$ and $A(\epsilon^*)$.

PROPERTY 3. Suppose $A$ is superconsistent and $(t^*, \mu^*)$ is the optimal solution to $A(\epsilon^*)$. Then $(t^*, \mu^*)$ is a Kuhn-Tucker solution for $A$.

To prove this result we need two intermediate results:

LEMMA 2. $g_k(t^*; \epsilon^*) = g_k(t^*)$ for $k = 0, 1, \ldots, p$.
PROOF: For each $k$ we have

$$g_k(t^*; \epsilon^*) = \prod_{i=m_k}^{n_k} \big(u_i(t^*)/\epsilon^*_i\big)^{\epsilon^*_i} = \prod_{i=m_k}^{n_k} \big(g_k(t^*)\big)^{\epsilon^*_i} \quad \text{since } \epsilon^*_i = u_i(t^*)/g_k(t^*),\ m_k \le i \le n_k,$$
$$= g_k(t^*) \quad \text{since } \sum_{i=m_k}^{n_k} \epsilon^*_i = 1.$$

We note that the above proof relies on the canonicality of $A$, which according to Remark 2 has been verified or the problem has been reformulated.

LEMMA 3. $\nabla_t g_k(t^*; \epsilon^*) = \nabla_t g_k(t^*)$ for $k = 0, 1, \ldots, p$.

PROOF: For each $k$ we have

$$\frac{\partial g_k(t)}{\partial t_j} = \sum_{i=m_k}^{n_k} \frac{a_{ij}\, u_i(t)}{t_j}$$

and

$$\frac{\partial g_k(t; \epsilon)}{\partial t_j} = \sum_{i=m_k}^{n_k} \frac{\epsilon_i a_{ij}\, g_k(t; \epsilon)}{t_j}. \tag{6}$$

For the $i$th term, $m_k \le i \le n_k$, we have from (6) the expression $\epsilon_i a_{ij}\, g_k(t; \epsilon)/t_j$. Letting $t = t^*$ and $\epsilon^*_i = u_i(t^*)/g_k(t^*)$, $m_k \le i \le n_k$, and using Lemma 2, we obtain

$$\frac{a_{ij}\, u_i(t^*)}{t^*_j}, \tag{7}$$

which is the $i$th term of $\partial g_k(t^*)/\partial t_j$.

To prove Property 3, let $(t^*, \mu^*)$ satisfy the Kuhn-Tucker conditions for $A(\epsilon^*)$. Then straightforward substitution of the results of Lemmas 2 and 3 shows that $(t^*, \mu^*)$ satisfies the Kuhn-Tucker conditions for program $A$. The question of the generation of such a point will be treated in the next section.

3. COMPUTATIONAL ASPECTS

Figures 1 and 2 reflect the computational procedure to be described: the geometric inequality (4) and subsequent logarithmic transformation are used to form $A_L(\epsilon)$, which is then solved via LP duality by solving $B_L(\epsilon)$. The reasons for solving $B_L(\epsilon)$ will be explained later. As a consequence of the results of the last section the procedure is an exterior method; that is, we approach the optimal solution from outside the feasible region. This exterior approach has been developed by Cheney and Goldstein [3] and Kelley [13]. Hartley and Hocking [12] presented a modified approach for convex programs. Avriel et al. [1] developed a cutting plane algorithm for polynomial geometric programs based on Duffin's geometric inequality [7]. In this section we describe some of the fundamental properties of these methods as well as develop acceleration devices. The section ends with the characterization of the optimal solution of such methods in terms of a Kuhn-Tucker solution to the original problem.

If we denote the current operating point as $t^l$, the condensation of a constraint $g_k(t)$, $k = 0, 1, \ldots, p$, about that point is defined as

$$g_k(t; \epsilon^l), \quad \text{where } \epsilon^l_i = u_i(t^l)/g_k(t^l),\ m_k \le i \le n_k,\ k = 0, 1, \ldots, p. \tag{8}$$

The value of the monomial constraint at the point $t^l$ is

$$g_k(t^l; \epsilon^l) = g_k(t^l).$$

Since in general $t^l$, the optimal solution to $A_L(\epsilon^l)$, is not feasible for $A$, we introduce a cutting plane about $t^l$ with respect to a violated constraint. This cutting plane is then adjoined to the linear program to form $A_L(\epsilon^{l+1})$. The basic properties of such cutting planes as given by Duffin [7] and Avriel et al. [1] are:

THEOREM 1. Let $t^l$ be the optimal solution to $A_L(\epsilon^l)$ and suppose $g_k(t^l) > 1$ for some $k$. Define $\epsilon^{l+1}_i = u_i(t^l)/g_k(t^l)$, $m_k \le i \le n_k$, and $S(\epsilon^{l+1}) = S(\epsilon^l) \cap \{t : g_k(t; \epsilon^{l+1}) \le 1\}$. Then $t^l \notin S(\epsilon^{l+1})$ ($t^l$ is not feasible with respect to the additional constraint $g_k(t; \epsilon^{l+1})$). Moreover, $S \subset S(\epsilon^{l+1})$ (the cut does not intersect the interior of $S$).

This result follows directly from the geometric inequality and provides the basis for cutting plane algorithms based on the geometric inequality. In an earlier work Avriel et al. [1] describe such an algorithm. This algorithm was based on the condensation of the most violated constraint; that is, condense $g_i(t)$ where $g_i(t^l) = \max_k \{g_k(t^l) : g_k(t^l) > 1\}$. While computational experience has indicated the effectiveness of such an approach, it deals with only one constraint at each iteration. An immediate extension is to consider all violated constraints. That is, if $g_k(t^l) > 1$ for some $1 \le k \le p$, consider the additional constraints $\{g_k(t; \epsilon^{l+1}) : g_k(t^l) > 1\}$. Following the notation of Theorem 1 we have the important result:

THEOREM 2. Suppose $g_k(t^l) > 1$, $k = 1, \ldots, q \le p$. Define

$$S(\epsilon^{l+1}) = S(\epsilon^l) \cap \Big\{\bigcap_{k=1}^{q} \{t : g_k(t; \epsilon^{l+1}) \le 1\}\Big\}.$$

Then $t^l \notin S(\epsilon^{l+1})$ and $S \subset S(\epsilon^{l+1})$.

PROOF: The proof is by induction on the number of violated constraints, with Theorem 1 proving the result for $q = 1$.
Assuming the result for some $q \ge 1$, by the induction hypothesis we have

$$t^l \notin S(\epsilon^l) \cap \Big\{\bigcap_{k=1}^{q} \{t : g_k(t; \epsilon^{l+1}) \le 1\}\Big\}$$

and from Theorem 1

$$g_{q+1}(t^l; \epsilon^{l+1}) = g_{q+1}(t^l) > 1.$$

Hence $t^l$ also violates the set obtained by adjoining the constraint $g_{q+1}(t; \epsilon^{l+1}) \le 1$. Also by the induction hypothesis

$$S \subset S(\epsilon^l) \cap \Big\{\bigcap_{k=1}^{q} \{t : g_k(t; \epsilon^{l+1}) \le 1\}\Big\}$$

and by Theorem 1 $g_{q+1}(t; \epsilon^{l+1}) \le g_{q+1}(t)$, which is enough to complete the proof.

We should note that while the above discussion centers on $A(\epsilon)$, we are really interested in $A_L(\epsilon)$. However, each $g_k(t; \epsilon^l)$ gives rise to a hyperplane in the program $A_L(\epsilon)$ via the linearization described earlier.

From a computational point of view, the dual LP $B_L(\epsilon)$ is more attractive than the primal LP. For a primal GP with $m+1$ variables, $p+1$ constraints, and $2(m+1)$ bounds, the associated dual LP has only $m+1$ rows and $2m+p+3$ columns. Also, while the additional constraints (cuts) described above would require an additional primal LP row, the cuts become columns in $B_L(\epsilon)$. Additional aspects will be discussed later.

We can illustrate the iterative procedure as follows. Suppose that $t^l$ is the optimal solution to $A_L(\epsilon^l)$, which is not feasible with respect to $A$:

STEP 1: Using (4), linearize all violated constraints with respect to the current infeasible point $t^l$ as

$$g_k(t; \epsilon^{l+1}) = \prod_{i=m_k}^{n_k} \big(u_i(t)/\epsilon_i\big)^{\epsilon_i}, \quad \text{where } \epsilon_i = u_i(t^l)/g_k(t^l),\ m_k \le i \le n_k, \tag{9}$$

and where $k$ is such that $g_k(t^l) > 1$.

STEP 2: Form the linear program $A_L(\epsilon^{l+1})$ by adding the linearized version of (9) to $A_L(\epsilon^l)$ in the form of additional constraints

$$C_k + \sum_{j} a'_{kj} z_j \le 0,$$

where $C_k$ is the logarithm of (2) and $a'_{kj}$ is given by (3).

STEP 3: Solve numerically the LP dual $B_L(\epsilon^{l+1})$. Call the solution $t^{l+1}$; if $t^{l+1}$ is not the optimal solution for $A$, return to Step 1.

While the above iterative procedure is stated in terms of the geometric inequality, it could be applied to the Kelley cutting plane algorithm by replacing (9) with the appropriate gradient-based linearization.

Since the above iterative procedure with most violated or all violated constraints is an exterior method, stopping criteria are developed based on the feasibility of the primal solution.
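Steps 1 and 2 can be sketched numerically. The following is a minimal illustration, not the authors' FORTRAN implementation: the function names `posynomial`, `condense`, `monomial`, and `lp_cut` are hypothetical, and the two-term posynomial and the operating point `t_l` are chosen only for demonstration. It computes the weights $\epsilon_i = u_i(t^l)/g_k(t^l)$, the coefficient of (2) and exponents of (3), and the resulting row of the linear program.

```python
import math

def posynomial(terms, t):
    """Evaluate g(t) = sum_i c_i * prod_j t_j**a_ij."""
    return sum(c * math.prod(tj ** a for tj, a in zip(t, exps))
               for c, exps in terms)

def condense(terms, t_l):
    """Condense a posynomial about the point t_l (Step 1).

    Returns the weights eps_i = u_i(t_l)/g(t_l), the coefficient C' of (2),
    and the exponents a'_j of (3) of the condensed monomial.
    """
    u = [c * math.prod(tj ** a for tj, a in zip(t_l, exps))
         for c, exps in terms]
    g = sum(u)
    eps = [ui / g for ui in u]
    c_prime = math.prod((c / e) ** e for (c, _), e in zip(terms, eps))
    a_prime = [sum(e * exps[j] for (_, exps), e in zip(terms, eps))
               for j in range(len(t_l))]
    return eps, c_prime, a_prime

def monomial(c_prime, a_prime, t):
    """Evaluate the condensed monomial C' * prod_j t_j**a'_j."""
    return c_prime * math.prod(tj ** a for tj, a in zip(t, a_prime))

def lp_cut(c_prime, a_prime):
    """Row added in Step 2: log C' + sum_j a'_j z_j <= 0, with z_j = log t_j."""
    return math.log(c_prime), a_prime

# Illustrative data (an assumption): g(t) = 0.375 t1^-1 t3 + 0.625 t1^-1 t3^-1,
# condensed about a point where the constraint g(t) <= 1 is violated.
terms = [(0.375, [-1, 0, 1]), (0.625, [-1, 0, -1])]
t_l = [0.5, 1.0, 2.0]

eps, c_prime, a_prime = condense(terms, t_l)
const, coeffs = lp_cut(c_prime, a_prime)
z_l = [math.log(tj) for tj in t_l]
# Left-hand side of the new LP row at the current point; positive means
# the current point is cut off.
cut_value = const + sum(a * z for a, z in zip(coeffs, z_l))
```

At the point of condensation the monomial and the posynomial agree, so the new row is violated exactly when the original constraint is, while at every other positive point the monomial does not exceed the posynomial, which is the content of (5).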
With that in mind we have as an immediate consequence of (5) and the previous results:

THEOREM 3. $M_{A(\epsilon^l)} \le M_{A(\epsilon^{l+1})} \le M_A$. Moreover, if $t^l$ is feasible for $A$ for some $l > 0$, then $t^l$ is optimal with respect to $A$.

Such results are translated into a stopping criterion based on a tolerance for feasibility; for example, $g_k(t^l) - 1 \le 10^{-6}$ for each $k = 1, \ldots, p$. However, as has been demonstrated by past computational experience [2, 5, 10], such a stopping criterion may result in values for the variables which vary from the optimal solution. One approach to obtain more accurate primal values is to reduce the feasibility tolerance. Alternately, we propose the use of an additional criterion based on the equality condition for the geometric inequality. This condition, $\epsilon^{l+1}_i = u_i(t^l)/g_k(t^l)$, can be used to compare successive values of the $\epsilon_i$, and those changing by more than a specified level, say $10^{-4}$, can be used to generate a new cut [5, 10].

Finally we show that the point generated by the above sequence of LP's satisfies the Kuhn-Tucker conditions of program $A(\epsilon^*)$.

THEOREM 4: Suppose $A$ is superconsistent and $(z^*, \psi^*)$ is the optimal solution generated by $A_L(\epsilon^*)$. Then $(t^*, \mu^*)$ satisfies the Kuhn-Tucker conditions for $A(\epsilon^*)$, where $t^*_j = \exp(z^*_j)$, $j = 0, 1, \ldots, m$, and $\mu^*_k = \psi^*_k t^*_0$, $k = 0, 1, \ldots, p$.

PROOF: Let $t^*$ be the optimal solution to $A$ and define $\epsilon^*_i = u_i(t^*)/g_k(t^*)$ for each $i$ and $k$. By the previous results the solution to $A_L(\epsilon^*)$ is $z^* = \ln t^*$. Denoting the dual LP variables of $A_L(\epsilon^*)$ as $\psi^*$, we see from Figure 1 that $\psi^*$ solves $B_L(\epsilon^*)$. Moreover, each constraint of $A(\epsilon^*)$ has associated with it $\psi^*_k$. Thus from [8] we compute the Lagrange multipliers for $A(\epsilon^*)$ as $\mu^*_k = \psi^*_k t^*_0$ for $k = 1, \ldots, p$. Thus $(t^*, \mu^*)$ is a Kuhn-Tucker solution for $A(\epsilon^*)$. The normality constraint of $B(\epsilon^*)$ is satisfied since $g_0(t^*)\, t_0^{*-1} = 1$ by construction.
As a consequence of this result we can state the relationship of the dual LP variables and the sensitivity coefficients of primal geometric programming constraints. The sensitivity coefficient of a primal constraint is the sum of the dual LP variables associated with the active linearly independent representations of that constraint. That is, $\psi^*_k$ is the sum of the dual LP variables associated with the $k$th constraint in $A_L(\epsilon^*)$.

4. ADDITIONAL COMPUTATIONAL ASPECTS AND EXAMPLES

As noted in the literature [1, 5, 10, 14, 17], cutting plane-type algorithms tend to generate a large number of cutting planes and can become computationally inefficient. Clearly, the dual program $B_L(\epsilon)$ is the appropriate program to solve since each cut generates a column (rather than a row in $A_L(\epsilon)$). In addition, the cut deletion approaches of Eaves and Zangwill [9] are important to delete redundant cutting planes. To illustrate the effect of such procedures we note that the initial dual LP for Example 1 has 12 columns (8 bounds and 4 linearized constraints), and for the most violated algorithm the final dual LP (21 iterations) would contain 32 columns if all cuts were retained. Similarly, the all-violated algorithm would contain 34 columns in the final dual LP (12 iterations). Since we need at most $m$ linearly independent hyperplanes to describe an $m$-dimensional point, we would like to identify loose or redundant cuts, via the dual LP variables, and discard them at the end of each iteration. The recent research of Eaves and Zangwill [9] has investigated the effect on convergence of such procedures. Under restrictions on the effect of dropping cuts on the value of the objective function we can implement such a scheme. More complete details are given in [9, 10], and in terms of Example 1 we can reduce the dual LP to 12 columns at each iteration.

As mentioned earlier, it is possible that $B$ is canonical and $A$ is inconsistent; such a situation implies $\min g_0(t) = +\infty$.
Thus some or all of the artificial upper bounds will be active at the optimal solution to $A(\epsilon^*)$ or, within an iteration, $B_L(\epsilon^l)$ will be unbounded, indicating that $A_L(\epsilon^l)$ is infeasible. Thus the upper bounds must be increased in order to ensure that they were not too restrictive to begin with. If the programs were canonical with a single optimal solution, all upper bounds must be inactive. The presence of alternate optimal solutions, at which some variables may be at their upper or lower bounds, dictates the presence of bounds to prevent identifying such programs as infeasible [5].

At this point it is of interest to relate the above method to the previously proposed cutting plane methods [12, 13, 14, 16, 17]. We should note that the algorithm proposed here is initialized in a slightly different form; that is, the initial LP consists of the linearization of $g_k(t; \epsilon^0)$, where $\epsilon^0$ is evaluated at the centroid of the bounds. All of the above mentioned methods initially solve

$$\min\ t_0 \quad \text{subject to} \quad L_j \le t_j \le U_j,\ j = 0, 1, \ldots, m.$$

It is clear that the algorithm proposed here could be initialized in this manner as well. With respect to the particular cutting plane methods we note:

Kelley's Convex Cutting Plane Method. As originally stated [13, 16], this method applied to linearly constrained problems; however, the method has been extended to arbitrary convex functions [14, 17]. This method constructs a cutting plane at the current operating point $t$, based on the gradient of the most violated constraint at that point. This new constraint is added to the LP and the problem is re-solved. (Zangwill [17] gives a slightly different version of this algorithm.) Thus the Kelley Cutting Plane Algorithm and the above algorithm based on the most violated constraint are similar, except that the linearization of the violated constraint is based on the geometric inequality (4) rather than on gradient information.
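The relation between the two linearizations can be made explicit in the log variables $z_j = \log t_j$. The short derivation below is a sketch added for illustration; it uses only (2), (3), and the weights of (8), and the symbol $G_k$ is introduced here rather than taken from the source.

$$G_k(z) = \log g_k(e^{z}) = \log \sum_{i=m_k}^{n_k} c_i \exp\Big(\sum_j a_{ij} z_j\Big), \qquad \frac{\partial G_k}{\partial z_j}(z^l) = \sum_{i=m_k}^{n_k} \frac{u_i(t^l)}{g_k(t^l)}\, a_{ij} = \sum_{i=m_k}^{n_k} \epsilon^l_i a_{ij} = a'_{kj},$$

$$\log g_k(t; \epsilon^l) = \log C'_k + \sum_j a'_{kj} z_j, \qquad \log g_k(t^l; \epsilon^l) = G_k(z^l).$$

Under these definitions the condensed constraint is the tangent plane of $G_k$ at $z^l$. The cut is therefore obtained from the term values $u_i(t^l)$ alone, without forming a gradient of $g_k$ explicitly, which is consistent with Lemmas 2 and 3.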
Recently, the Kelley method was extended to consider all violated constraints [10].

Convex Algorithm. Hartley and Hocking [12] presented a modification of the above cutting plane method. While this method reduces the number of LP's to be solved, it requires the construction of a tangential supporting hyperplane. This construction of the hyperplane is based on the evaluation of the constraint (functions) at grid points imposed on the $n$-dimensional space.

Supporting Hyperplane Cut. Luenberger [14] and Zangwill [17] describe a method of generating a cutting plane that supports the feasible region of the original problem. While similar to the Hartley and Hocking approach, it replaces the grid with a line search. It is clear that this method generates deeper (i.e., tangent) cuts than the above methods; however, it does require a one-dimensional search and gradient components as well as a feasible starting point. The line search is used to determine the point of intersection of the line joining the feasible point and the current (infeasible) point with the original constraints.

A comparison of the computational efficiency of these methods is reported elsewhere [5, 10].

In order to illustrate the cutting plane algorithm based on the geometric inequality, the most violated and all-violated algorithms were coded in FORTRAN IV. The algorithm used the linear programming routine of Clasen [4] to solve the dual LP at each iteration. Two sample problems were taken from the recent study of Beck and Ecker [2], and the program was implemented on an IBM 370/165 at The Pennsylvania State University.

EXAMPLE 1. This example was taken from the recent study of Beck and Ecker [2]. The original version is

$$\min\ 5 t_1^{2} t_2^{-1} t_3 + 10 t_1^{-3} t_2^{2} t_3^{-2}$$
$$\text{subject to} \quad 0.1\, t_1^{-1} t_2^{-1} + 0.2\, t_2^{-1} t_3^{-1} \le 1$$
$$0.375\, t_1^{-1} t_3 + 0.625\, t_1^{-1} t_3^{-1} \le 1$$
$$t > 0,$$

which is transformed to $\min t_0$ subject to the additional constraint

$$5 t_0^{-1} t_1^{2} t_2^{-1} t_3 + 10 t_0^{-1} t_1^{-3} t_2^{2} t_3^{-2} \le 1.$$
Since nothing is assumed about the problem, the upper and lower bounds were arbitrarily set at $U_j = 1000$, $L_j = 0.00001$ for $j = 0, 1, 2, 3$. The algorithm was then initialized at the center of this hypercube, $t_j = 500$ for each $j$, which is an infeasible starting point. Table 1 summarizes the solution procedure for both the most violated and all-violated algorithms. In addition, Table 1 presents the results of the algorithms with a feasible starting point.

Table 1. Computational Results for Example 1

Infeasible Starting Point (t_j = 500, j = 0, 1, 2, 3)

Most Violated Constraint
Iteration   t0 (†)      g0(t) (††)    g1(t)      g2(t)
0           500         2500          .000001    .375
1           .00001      2.2×10^[?]    .0019      29.393
2           4.9701      2.            1.1217     1.9718
3           8.2281      1.25          1.1217     1.7553
4           9.6623      1.25          1.0007     1.1781
5           10.8730     1.052         1.0039     1.2721
6           11.0611     1.052         .4883      1.0578
7           11.1482     1.052         .3540      1.0140
8           11.6817     1.0134        .3979      1.0140
9           11.7272     1.0134        .33996     1.0035
10          11.75109    1.0032        .32023     1.0035
15          11.77568    1.0002        .33207     1.00005
20          11.77809    1.00000       .33402     1.00000
21          11.77811    1.00000       .33472     1.00000
22          11.77812    1.00000       .33501     1.00000
Execution time: .6 seconds (IBM 370/165)

All Violated Constraints
Iteration   t0 (†)      g0(t) (††)    g1(t)      g2(t)
0           500         2500          .000001    .375
1           .00001      2.2×10^[?]    .0019      29.3929
2           .65809      1.9999        8883.5742  328945.187
3           8.0557      1.25          1.4155     1.8575
4           11.0351     1.0521        .8764      1.1995
5           11.6585     1.0134        .5477      1.0437
6           11.7264     1.0032        .3962      1.0107
7           11.7598     1.0008        .3386      1.0027
8           11.7753     1.0002        .3520      1.0006
9           11.7775     1.00005       .3442      1.0002
10          11.7778     1.00001       .33800     1.00004
11          11.7780     1.00000       .33493     1.00001
12          11.77807    1.00000       .33363     1.00000
13          11.77808    1.00000       .33438     1.00000
14          11.77810    1.00000       .33468     1.00000
Execution time: .43 seconds

Feasible Starting Point t = (14.2389, 1.1121, .9066, 1.4306)

Most Violated Constraint
Iteration   t0          g0(t)         g1(t)      g2(t)
0           14.2389     .8903         .2534      .8752
1           6.3911      21.287        .3429      1.6446
2           9.7413      1.4571        .8746      1.3969
3           10.7658     1.0991        .7236      1.1859
4           10.8790     1.0991        .4075      1.0413
5           11.3997     1.0250        .4740      1.0413
6           11.4556     1.0250        .3647      1.0101
7           11.7373     1.0061        .3377      1.0101
8           11.7657     1.0061        .2969      1.0025
9           11.7680     1.0015        .3085      1.0025
10          11.7735     1.0015        .3255      1.0006
15          11.7778     1.00000       .3316      1.00000
17          11.7780     1.00000       .3348      1.00000
18          11.7781     1.00000       .3359      1.00000
19          11.7781     1.00000       .33495     1.00000
Execution time: .57 seconds

All Violated Constraints
Iteration   t0          g0(t)         g1(t)      g2(t)
0           14.2389     .8903         .2534      .8752
1           6.3911      21.287        .3429      1.6446
2           9.8693      1.4571        .4197      1.1099
3           10.9011     1.0991        .3652      1.0257
4           11.4669     1.0249        .3458      1.0063
5           11.7716     1.0061        .2892      1.0016
6           11.7749     1.0015        .3140      1.0004
7           11.7766     1.0004        .3273      1.0001
8           11.7775     1.0001        .3342      1.00002
9           11.7778     1.00003       .3340      1.00001
10          11.7780     1.00000       .3348      1.00000
11          11.7781     1.00000       .3349      1.00000
Execution time: .47 seconds

(†) Objective function value.
(††) Corresponds to the constraint $g_0(t)\, t_0^{-1} \le 1$.
Optimal solution $t^* = (11.7781, .968245, .771131, 1.29054)$.

EXAMPLE 2. This example presents a larger problem (24 variables) and illustrates the efficiency of the algorithm when the bounds on the variables are very restrictive, that is, $0 < t_j \le 1$ for each $j$. The problem statement is:

+27.4<3"-''+179<r''+91.5^f-'+120/8-^'+86<n-^«+152^r2" +27.4<r5«^+179^-6''+91.5^r8-^°+120«2"o''+45.9i23«+152<2-,"}

subject to:

$.2891\, y_1 \le 1$
$.2609\, y_1 + .3112\, y_2 \le 1$
$.2487\, y_1 + .6034\, y_2 + .0289\, y_3 \le 1$
$.2016\, y_1 + 1.0360\, y_2 + .0777\, y_3 + .0934\, y_4 \le 1$
$.1946\, y_1 + 1.0560\, y_2 + .0799\, y_3 + .0986\, y_4 + .0028\, y_5 \le 1$
$.1708\, y_1 + 1.0330\, y_2 + .0799\, y_3 + .1032\, y_4 + .0079\, y_5 + .0034\, y_6 \le 1$
.15 tf^ tjU ti+2<l, j = 1, 5, 9, 13, 17, 21
.&5tj-'<l, j = 1, 5, 9, 13, 17, 21
$t_j \le 1$, $j = 1, \ldots, 24$

where $C_1 = 19.4$, $C_2 = 16.8$, $a_1 = -1.47$, $a_2 = -1.66$, and

$y_1 = t_1 t_2 t_3 t_4$, $y_2 = t_5 t_6 t_7 t_8$, $y_3 = t_9 t_{10} t_{11} t_{12}$,
$y_4 = t_{13} t_{14} t_{15} t_{16}$, $y_5 = t_{17} t_{18} t_{19} t_{20}$, $y_6 = t_{21} t_{22} t_{23} t_{24}$.

The objective function was bounded by $1 \le t_0 \le 50{,}000$.
The starting point used was $t_j = 0.825$, $j = 1, 5, 9, 13, 17, 21$, and $t_j = 0.5$ for all other $j$, which is feasible. Table 2 summarizes the computational results for this example. With regard to these results the feasibility tolerance was set at 10~®, and once this tolerance was attained the procedure following Theorem 3 was used to determine the stability of the decision variables. As can be seen from Table 2, additional iterations were performed to bring the variables within this new tolerance, 10~^.

Table 2. Computational Results for Example 2

Result                                 Cuts Generated by           Cuts Generated by
                                       Most Violated Constraint    All-Violated Constraints
Number of cuts to optimality           23                          14
Execution time (seconds)               5.297                       3.547
Number of cuts to first feasibility    22                          13
Execution time (seconds)               5.104                       3.157

Optimal Solution: «*= 0.944445, <* = 0.938679, <?= 0.666648, t* = 1 for all other j; S(,a*) = 1504.297388

5. SUMMARY AND CONCLUSIONS

The previous examples point out several additional computational aspects of the algorithm.

1. Since the upper and lower bounds were arbitrarily set, the infeasible starting point (the centroid of these bounds) was not a good one. However, as the results from Table 1 indicate, this did not appear to affect the algorithm's performance when compared to a feasible starting point. Preliminary results from a more detailed study [5, 10] indicate more substantial savings when a feasible starting point is used. However, the generation of such a point can be difficult. At any rate, an attractive feature of the above algorithm is that it does not depend on a feasible starting point.

2. While the above examples are small, they do illustrate the advantage of the all-violated constraint method with respect to number of iterations and execution time. It is expected that with larger problems the results will be more dramatic.

3. The computational results presented here are a function of the LP algorithm; we used a version of an LP code available from Rand [4] on the dual program $B(\epsilon)$. More sophisticated LP routines, such as revised simplex or dual simplex, would almost certainly result in increased efficiency.

The above remarks point out areas of computational improvement if the algorithm is to be competitive in the solution of large problems. Finally, it should be noted that the algorithm is a primal-dual algorithm in both an LP and GP sense. The GP duality is explicitly exploited by Theorem 4, and thus we obtain the valuable dual information as a result of the computational procedure.

BIBLIOGRAPHY

[1] Avriel, M., R. Dembo, and U. Passy, "An Algorithm for the Solution of Generalized Geometric Programs," International Journal for Numerical Methods in Engineering, 9, 149-168 (1975). Also presented at 41st TIMS Meeting, Houston, Texas (April 1972).
[2] Beck, P. A., and J. G. Ecker, "Some Computational Experience with a Modified Convex Simplex Algorithm for Geometric Programming," Journal of Optimization Theory and Applications, 14 (6) (December 1974).
[3] Cheney, E. W., and A. A. Goldstein, "Newton's Method of Convex Programming and Tchebycheff Approximation," Numerische Mathematik, 1, 253-268 (1959).
[4] Clasen, R. J., "Using Linear Programming as a Simplex Subroutine," Rand Document P-3267 (November 1965).
[5] Dinkel, J. J., W. H. Elliott, and G. A. Kochenberger, "Computational Study of Cutting Plane Methods in Geometric Programming," Mathematical Programming, 13, 200-220 (1977).
[6] Dinkel, J. J., G. A. Kochenberger, and B. McCarl, "On the Numerical Solution of Geometric Programs," Mathematical Programming, 7, 181-190 (1974).
[7] Duffin, R. J., "Linearizing Geometric Programs," SIAM Review, 12, 211-227 (1970).
[8] Duffin, R. J., E. L. Peterson, and C. Zener, Geometric Programming: Theory and Applications (John Wiley & Sons, Inc., New York, 1967).
[9] Eaves, B. C., and W. I. Zangwill, "Generalized Cutting Plane Algorithms," SIAM Journal of Control, 9, 529-542 (1971).
[10] Elliott, W. H., "Primal Geometric Programs Treated by Sequential Linear Programming," Ph.D. Thesis, The Pennsylvania State University (August 1976).
[11] Frank, C. J., "An Algorithm for Geometric Programming," in Recent Advances in Optimization Techniques, A. Lavi and T. P. Vogl, Eds., pp. 145-162 (John Wiley & Sons, New York, 1965).
[12] Hartley, H. O., and R. R. Hocking, "Convex Programming by Tangential Approximation," Management Science, 9 (4), 600-612 (1963).
[13] Kelley, J. E., "The Cutting Plane Method for Solving Convex Programs," SIAM Journal on Applied Mathematics, 8 (4), 703-712 (1960).
[14] Luenberger, D. G., Introduction to Linear and Nonlinear Programming (Addison-Wesley, Reading, Massachusetts, 1973).
[15] Wilde, D. J., and C. S. Beightler, Foundations of Optimization (Prentice-Hall, Englewood Cliffs, New Jersey, 1967).
[16] Wolfe, P., "Accelerating the Cutting Plane Method for Nonlinear Programming," SIAM Journal on Applied Mathematics, 9 (3), 481-488 (1961).
[17] Zangwill, W. I., Nonlinear Programming: A Unified Approach (Prentice-Hall, Englewood Cliffs, New Jersey, 1969).

ON THE AGGREGATION OF PREFERENCES

Walburga Rodding
University of Dortmund
Federal Republic of Germany

Hans H. Nachtkamp
University of Mannheim
Federal Republic of Germany

ABSTRACT

The article attempts to show how network theory may be applied to gain new and better insights into basic economic problems. Starting with a precise definition of what is meant by acting and, in particular, by economic acting, we direct the line of argumentation toward solving the problem of how to aggregate economic decisions. Results indicate that network theory might well prove itself to be a powerful instrument in developing a theory of human behavior much more comprehensive than currently used models.

1. INTRODUCTION

The following study of the application of network theory to economics has several motivations. The first is dissatisfaction with the axiomatic acrobatics of current preference theory. Some of its axioms seem more an adaptation of reality to mathematics than an attempt to provide a good picture of actual human behavior. The axiom of complete ordering has already been criticized very intensively (Churchman 1961, [4]). The assumptions that the range of alternatives, the opportunity set, is connected, i.e. that the agent may choose from an uncountably infinite set of alternatives, and that preferences are also continuous, are even harder to justify. Clearly no one considers fragments of seconds in working out the division of his time into work and leisure, or ponders whether to buy 1 ounce or 1.000 ... 01 ounces of, say, sugar. Further, even if "objectively speaking" there is a continuity of possible choices, the agent does not see it as a continuous infinity, and it is with his perceived range of alternatives alone that the theory should be concerned (Koch 1973 [7]). The set of alternatives of an agent, as perceived or even in reality, is a discrete surrounding of possible or at least conceivable choices. (A very good example of the latter case is dealt with by Morgenstern 1948 [10].) Axioms like those of completeness and continuity are vital for the existence of order-preserving real-valued utility functions, but they do violence to observations of actual behavior, thereby blocking at the outset ways to our understanding of it.

Secondly, there seems to be a lack of understanding of aggregation in economics. "The belief that the particular way in which aggregates arise is unimportant underlies much of contemporary economic theory (e.g., Keynesian theory . . .). It involves an idea of simplification which falsifies the very inner structure of economic problems and phenomena."
(See Morgenstern 1948, [10]; see also Morgenstern 1972, [12].) We think that aggregation processes are accurately portrayed neither by the operation of merely summing up, e.g. as done in supply and demand theories, nor by voting. Voting may be an instrument by which an aggregate preference, if one exists, becomes visible; it does not, however, constitute an aggregate preference. Aggregation is, on the contrary, something resulting from the particular kind of structural interdependence of agents. The same holds not only for the aggregation of persons, but for aggregation over time as well. The principles of temporal aggregation underlying a dynamic theory based upon the theories of difference or differential equations are either too simple or quite unclear. The theory of games makes significant contributions to the problem of aggregation. However, it uses traditional preference theory and, for dynamic extensions, difference or differential equations.

The fact that until now economists have been unsuccessful in basing macroeconomics on firm microeconomic concepts spotlights the failure to solve the aggregation problem, which has often been merely bypassed with resignation (Green 1964, [5]). This has led us to ask, as the third impetus to our research work, whether economic theory has possibly been founded on inadequate instruments. At any rate it seems remarkable that a theory claiming to explain human acting has not yet incorporated acting into its established scientific language.

Consequently, we begin by trying to give a precise notion of what is meant by acting. This is done by making use of a logical construction called a Mealy-type automaton as a general model of an agent. Acting is conceived of as a transition from one situation to another (von Kempski 1954 [6]), so that a theory grounded in this way is essentially dynamic (Section II).
Economic action has always been understood as the selection of the best from a given collection of available alternatives. Thus we must demonstrate what realizing a preference structure through an automaton means (Section III). Since we require only that a preference ordering be transitive in an elementary way (to be defined more closely later) and drop completeness and, in particular, continuity of alternatives and preferences, the formation of a numerical utility function of the traditional kind is ruled out. An important instrument is thereby lost which was capable of producing a "workable" theory. If certain commodities can be bought only by the ounce or the whole piece, then depicting pairwise comparisons of obtainable commodity bundles presumes extensive combinatorial reasoning, unless a new instrument can be found to simplify analysis. In fact, not even the most efficient and up-to-date computer is capable of playing through, for example, all possible moves of a chess game, much less all profiles of acting in a society, within a foreseeable space of time. Chess pieces, however, are given clear values depending on the combinatorial opportunities open to the chess player which lead to checkmating his opponent. In a similar way, the "strategic" value of a slice of bread for a household or of a certain stockpile for an enterprise depends on the possibilities open to the agent to make use of them and, in addition, on his place in a social framework, i.e. on the degree of influence that other agents can exert on his actions and that he can exert on theirs. Thus our "model of man" can interact with other similar beings, who are then able to bring about a change in his situation and vice versa. The logical instrument for depicting a social framework is a network of automata (Section IV).
It will be shown that aggregation based on the structural combination of automata into networks ensues in a very simple and completely natural way. Since, according to the inductive definition of networks, every automaton is a network and every network is an automaton, any agent at every stage of aggregation may be regarded as a microunit and, at the same time, as a macrounit by interpreting the model of this agent as an automaton or a network, respectively. Thus in the example below, a certain model corresponds to A, if regarded as an automaton, and to B, if regarded as a network.

A                                                 B
a faculty of a human being (intellect,            an aggregate of subfaculties of this faculty.
memory, metabolism, etc.).
a person.                                         an aggregate of faculties of this person.
a collective.                                     a corresponding aggregate of persons.
a collective of collectives of ... of             a corresponding aggregate of aggregates of ... of
collectives.                                      aggregates of persons.

Obviously, a new concept of macroeconomic theory can hereupon be embraced, a theory allowing transitions from microeconomic to macroeconomic reasoning without loss of information about the internal aspects of macrounits.

In Section IV, part 2, this principle of aggregation is related to preference theory. There are two results. The first is a statement akin to the famous Impossibility Theorem by Arrow (1951) [1]. The second is concerned with the Theory of Revealed Preferences, and shows that conclusions derived from an agent's observed behavior may lead to thoroughly wrong ideas concerning his preference structure.
In Section V a theorem will be developed which may be conceived as a first step towards exploring the aggregation of preferences; it shows that preference theory is more comprehensive than has seemed to be the case until now: any agent's acting in accordance with definite, though by no means invariant, maxims may be simulated by a network of elementary automata, part of which realize an elementary preference structure.

II. THE GENERAL MODEL OF AN AGENT

1. Modelling Human Acting

Actions may be distinguished according to whether they are caused from the outside world (i.e. the agent alters his situation on receiving information) or not (i.e. the agent autonomously changes his situation). In addition, actions differ as to whether or not the outside world may obtain information from the agent. Thus there are at least four types of actions, as depicted in Table 1 below:

Table 1. Types of Actions

                                    Information to the outside world
Transition                          yes         no
Caused from the outside world       type 1      type 2
Autonomous                          type 3      type 4

Consequently, the model of man must be able to receive and to disseminate information as well as to make transitions from one situation to another. Therefore, the structure of this model is characterized by a set

$X := \bigcup_{1 \le i \le m} X_i$

of inputs (pieces of information which it can receive), a set

$Y := \bigcup_{1 \le j \le n} Y_j$

of outputs (pieces of information which it can disseminate), a set $Z$ of possible internal states, and three so-called transition functions, $\alpha$, $\beta$, $\gamma$, which denote connections between inputs, internal states, and outputs. The model may be able to receive inputs or to send outputs via several input or output channels. Correspondingly, $X_i$ is the set of all inputs which the structure can receive via the $i$th input channel ($i = 1, 2, \ldots, m$), and $Y_j$ the set of all outputs it can disseminate via the $j$th output channel ($j = 1, 2, \ldots, n$). The disjoint union of the $X_i$ is denoted by $X$, the disjoint union of the $Y_j$ by $Y$.
A structure described in terms of $(X, Y, Z, \alpha, \beta, \gamma)$ is called an abstract automaton. As $X$, $Y$, and $Z$ are finite sets, the structure here used is a finite automaton.

The transition relations $\alpha$, $\beta$, $\gamma$ and their relative products (i.e. their concatenations) are used to model the four types of transitions listed above. The function $\alpha$ governs the behavior of the automaton on receiving bits of information. When the automaton receives an input in a certain initial state, it turns to a resultant state. The function $\beta$ takes into account the automaton's ability to alter its internal state without communicating with the external world. Finally, the function $\gamma$ is concerned with the automaton's disseminating information to its environment. Whenever the automaton is caused by its given state to send a bit of information (generally accompanied by a change of its internal state), this is depicted by applying $\gamma$ to the given state.

Resultant states, or ordered pairs of an output and a resultant state, are not uniquely determined by applying $\alpha$ to an ordered pair consisting of an input and an initial state, or $\beta$ or $\gamma$ to the initial state. The image of an ordered pair $(x, z) \in X \times Z$ by $\alpha$ and the transform of a $z \in Z$ by $\beta$ or $\gamma$ are rather a subset of the set $Z$ of states or of the set $Y \times Z$ of ordered pairs consisting of an output and a resultant state. For, firstly, preferences can influence acting only where the agent is free to choose between alternatives. Therefore, the automaton should not be bound to one and only one transition from a given initial state. Secondly, experience has shown that theories in the frame of the social sciences are not capable of predicting the actions of human individuals or groups in a unique way. Consequently, it seems obvious to take account of this indeterminacy in introducing the means into the formal language by which the behavior of agents is to be described, i.e. to use an adequate notion of transition functions.
Thus we employ indeterminate automata as a concept of acting. Specifically, the function $\alpha$ is a transformation of the set $X \times Z$ (the cartesian product of $X$ with $Z$) into the power set of $Z$:

$\alpha: X \times Z \to P(Z)$

By the function $\beta$, a subset of $Z$ is associated with an element $z \in Z$:

$\beta: Z \to P(Z)$

Finally, by the function $\gamma$, a subset of the set $Y \times Z$ of all ordered pairs $(y, z)$ consisting of an output and an inner state is the transform of an inner state:

$\gamma: Z \to P(Y \times Z)$

Application of the functions $\alpha$, $\beta$, $\gamma$ causes an $\alpha$-, $\beta$-, or $\gamma$-transition, respectively.

Lack of information concerning the behavior of the agent in question may lead to

• ordered pairs not belonging to the domain of $\alpha$, or to
• states $z \in Z$ not belonging to the domains of $\beta$ or $\gamma$.

Thus the function $\alpha$ tells nothing of how the automaton reacts upon receiving the given signal $x$ in the given state $z$; $\beta$ and $\gamma$ provide no information about how the automaton behaves when it is in the given state $z$. Such incomplete automata are allowed as an accommodation to fragmentary knowledge of an agent's behavioral pattern or even to agents not fully aware of their behavioral opportunities.

It is easy to see that the four types of acting listed in the above scheme can be depicted by $\alpha$, $\beta$, $\gamma$, or their relative products as follows:

type 1 by $\alpha \cdot \beta^r \cdot \gamma$, with $r$ a finite nonnegative integer,
type 2 by $\alpha \cdot \beta^r$,
type 3 by $\beta^r \cdot \gamma$,
type 4 by $\beta^{r+1}$,

where $\beta^r := \beta \cdot \beta \cdots \beta$ ($r$ times), $\alpha \cdot \beta^0 = \alpha$, and $\beta^0 \cdot \gamma = \gamma$.

This model of an action unit, an indeterminate finite and potentially incomplete automaton, is the "agent", the "model of man" of that microeconomic theory whose basis is to be developed here.

2. Some Technical Developments

The $(\alpha, \beta, \gamma)$-portrait of acting has been introduced because of its intuitive appeal. For practical reasons we now turn to the use of ordered quadruples $(x, z, z', y)$ as a notion of what is meant by acting.
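The passage from the three transition functions to quadruples can be sketched as follows for type 1 actions. This is only an illustration under stated assumptions: the dictionary encoding, the function name `quadruples`, the cap `max_beta_steps` on the number of $\beta$-transitions, and the toy states are all hypothetical. Keys missing from a dictionary stand for the undefined cases of an incomplete automaton.

```python
def quadruples(alpha, beta, gamma, max_beta_steps):
    """Collect quadruples (x, z, z', y) produced by alpha . beta^r . gamma.

    alpha maps (x, z) to a set of states, beta maps z to a set of states,
    gamma maps z to a set of (y, z') pairs; r ranges over 0..max_beta_steps.
    """
    result = set()
    for (x, z), after_alpha in alpha.items():
        frontier = set(after_alpha)   # states reached by exactly r beta-steps
        reached = set(frontier)       # states reached by at most r beta-steps
        for _ in range(max_beta_steps):
            frontier = {z2 for z1 in frontier for z2 in beta.get(z1, ())}
            reached |= frontier
        for z1 in reached:
            for (y, z_prime) in gamma.get(z1, ()):
                result.add((x, z, z_prime, y))
    return result


# Hypothetical agent: input "x" in state "s" may leave it in "s" or move it
# to "m"; from "m" it may drift to "h" on its own; outputs exist in "s", "h".
alpha = {("x", "s"): {"s", "m"}}
beta = {"m": {"h"}}
gamma = {"s": {("y1", "s")}, "h": {("y2", "h")}}

type1_actions = quadruples(alpha, beta, gamma, max_beta_steps=1)
```

With no $\beta$-step allowed only the direct reaction in state "s" survives; allowing one $\beta$-step adds the action that ends in "h".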
The intuitive interpretation of such a quadruple is, in our preceding terminology, as follows: An automaton which is in state $z$ first receives the input signal $x$ via an $\alpha$-transition, then performs a finite, non-negative number of $\beta$-transitions, and finally sends the output signal $y$ by a $\gamma$-transition, with $z'$ the resultant state.

According to this view, an automaton is an ordered quadruple $(X, Y, Z, U)$ with $U \subset X \times Z \times Z \times Y$. It can be graphically represented by means of a black box with input and output channels for a standard signal. The latter means that the set of signals available to the automaton for inputs and outputs consists of one element only. The restriction to a standard signal does not limit the possibilities of modelling acting units, since a one-to-one function between the set of $n$ signals which can be transmitted on one channel and a set of $n$ channels on which a standard signal can be transmitted is easy to establish. Consequently, $X$ now denotes the set of input channels, and $Y$ the set of output channels.

The use of automata is normalized as follows: An automaton can receive an input only after it has sent an output. (In Section VI a hint will be given as to how such automata can be adapted to agents who first receive a succession of several inputs and only then give one or more outputs.) Thus, experiments with automata result in finite sequences starting with an input and terminating with an output. To be more precise, experiments are finite sequences of ordered quadruples $(x_i, z_i, z_i', y_i)$, $i = 1, 2, \ldots, n$, with $z_{i+1} = z_i'$.

Sometimes one takes into consideration the behavior of an automaton concerning the external world only, thereby neglecting the changes of its state. An adequate term of the formal language used here is the notion of a protocol. Protocols are the sequences of ordered pairs $(x_i, y_i)$ defined in accordance with experiments of the above type.
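A small sketch may help to fix the notions of experiment and protocol. It assumes an automaton is stored as a Python set of quadruples $(x, z, z', y)$ and enumerates experiments of one fixed length only; the helper names and the two-state example are hypothetical.

```python
def experiments(U, z0, length):
    """All experiments with exactly `length` quadruples starting in state z0.

    Consecutive quadruples are chained: the resultant state of one step is
    the initial state of the next.
    """
    if length == 0:
        return [[]]
    result = []
    for (x, z, z_next, y) in sorted(U):
        if z == z0:
            for tail in experiments(U, z_next, length - 1):
                result.append([(x, z, z_next, y)] + tail)
    return result


def protocols(U, z0, length):
    """Protocols of the above experiments: only the (input, output) pairs."""
    return {tuple((x, y) for (x, _, _, y) in e)
            for e in experiments(U, z0, length)}


# Hypothetical automaton: one input channel 0, output channels 1 and 2,
# states "a1" and "a2"; the output channel names the resultant state.
U_FREE = {(0, "a1", "a1", 1), (0, "a1", "a2", 2),
          (0, "a2", "a1", 1), (0, "a2", "a2", 2)}
# A second automaton that always remains in its initial state.
U_STUCK = {(0, "a1", "a1", 1), (0, "a2", "a2", 2)}

free_protocols = protocols(U_FREE, "a1", 2)
stuck_protocols = protocols(U_STUCK, "a1", 2)
```

The first automaton admits four different two-step protocols from "a1", the second only one, which shows how removing transitions narrows the observable behavior.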
The set Prot($z$) of protocols associated with a state $z$ is the set of protocols of all experiments possible with the automaton having the initial state $z$. Suppose an automaton is in state $z_0$. Then Prot($z_0$) consists of the complete list of ways in which the automaton may communicate with its environment.

III. A MODEL OF ECONOMIC ACTING

Having considered acting in general, let us now turn to the concept of economic acting, i.e. acting in order to reach certain goals. Each theory aiming at an explanation of economic acting necessarily starts by establishing a principle of valuation. Though the analysis of how preferences originate is an essential task of our future research, we will accept for now the principle that people realize a preference ordering. What, then, is meant by the statement that an automaton $A$ realizes a preference structure $P$?

1. The Meaning of an Automaton Economically Acting

The first step is to clarify what objects are compared by our "model of man", given his aims. With respect to traditional economic theory, we think it reasonable to order states rather than actions. Actions are the means by which the one or the other state can be arrived at. It is the states that are compared by an acting unit, which must judge "This state is better than that one", or, "Both states are equally good (or bad)."

Realizing a preference structure $P$ by the automaton $A$ requires, in analogy to human acting, that certain of the original experiments with $A$ are no longer possible. This is accomplished by not allowing certain transitions of $A$. Thus an automaton $A$, by realizing a preference structure, shows an "impoverishment" of its behavioral pattern because, given its initial state, it can no longer select any state from the set of possible resulting states, but has to choose according to its preferences. Intuitively, an automaton with preferences has less freedom of choice, i.e., is "less indeterminate", than an otherwise similar automaton without preferences.
Naturally, the economically acting agent does not lose any capacity for receiving and sending information; he does not become stupid, and his opportunity set is not affected by his "rational" acting. Accordingly, the sets of inputs, outputs, and states remain unaltered if the automaton realizes a preference structure. Moreover, whenever an agent is able to act in a given situation, he can also act if he is realizing a preference structure, since realizing a preference structure does not cancel the ability of doing something. Thus, the realization of a preference structure by an automaton $A$ generates a subautomaton $B$, $B \subset A$. In contrast to usual definitions of subautomata, $B$ differs from $A$ only as to the number of transitions; $A$ can perform all transitions open to $B$, while the contrary does not hold. However, $B$ has the same sets of inputs, outputs, and states as $A$. The precise notion of a subautomaton as used in this paper is given in the following definition:

DEFINITION 1: $B = (X_B, Y_B, Z_B, U_B) \subset A := (X_A, Y_A, Z_A, U_A)$ iff all of the following conditions hold:

(a) $X_B = X_A$, $Y_B = Y_A$, $Z_B = Z_A$,
(b) $U_B \subset U_A$, where
(b1) $U_L := \bigcup_i \bigcup_j U_L((x_i, z_j))$; $U_L((x_i, z_j)) := \{(x_i, z_j, z', y)\}_L$, $L = A, B$,
(b2) $\forall ((x, z) \in X_A \times Z_A)\,(U_A((x, z)) \ne \emptyset \rightarrow U_B((x, z)) \ne \emptyset)$,

where $\emptyset$ denotes the empty set.

2. A Precise Notion of an Automaton Realizing a Preference Structure

We are now in a position to define the intuitive idea that automaton $A$, in selecting its transitions, takes into account the preference structure $P$. In the context of automata, satisficing behavior seems more natural than optimizing behavior, for, if optimal choices are admitted, it may turn out that an automaton will never do anything again after its first transition. Human life, however, means passing through a sequence of states rather than resting upon the merits of the first choice.
Therefore it is assumed that an automaton which is in a certain initial state $z$ will, if possible, on the receipt of a certain input $x$ choose a transition from those open to it that will bring it into a better state than $z$; if it cannot, it will at least try to get to a state not worse than $z$. Cases in which an automaton is forced to a worse state are left to further investigation.

To make ideas precise, a preference structure is defined to be a pair $(B_P, R_P)$, where $R_P$ is a binary, transitive, and irreflexive relation (for definition, see [3], pp. 197-198) on the finite set $B_P$. Consider a partition $K$ of the set $Z_A$ of $A$'s states with classes $k(z)$, and a mapping $p$, defined for this partition with values in $B_P$. One state $z \in Z_A$ is said to be better than another state $z' \in Z_A$ iff $(p(k(z')), p(k(z))) \in R_P$.

As a clarification of what is meant by an automaton $A$ realizing a preference structure $P$ and thereby generating a subautomaton $B$, we provide

DEFINITION 2: The triple $(A, B, P)$ is termed admissible for automata $A$, $B$ and a preference structure $P := (B_P, R_P)$ iff all of the following conditions hold:

(a) $B \subset A$,
(b) there exists a partition $K$, with classes $k(z)$, of the set $Z_A$ of states of $A$, and an (injective) mapping $p$ of $K$ into the finite range $B_P$ on which a binary, transitive, and irreflexive relation $R_P$ is defined,
(c) $U_B := \bigcup_i \bigcup_j U_B((x_i, z_j))$;

$U_B((x_i, z_j)) := U_A((x_i, z_j))$ if $B_A((x_i, z_j)) = \emptyset$, and $U_B((x_i, z_j)) := B_A((x_i, z_j))$ otherwise,

where $B_A((x_i, z_j)) \subset U_A((x_i, z_j))$ such that

$B_A((x_i, z_j)) := \{(x_i, z_j, z', y) \mid (p(k(z')), p(k(z_j))) \in R_P\} \cup \{(x_i, z_j, z', y) \mid p(k(z')) = p(k(z_j)) \wedge \{(x_i, z_j, z', y) \mid (p(k(z')), p(k(z_j))) \in R_P\} = \emptyset\}$.

3. The Relevance of Realizing a Preference Structure

At first glance the question may arise as to whether any subautomaton $A'$ of an automaton $A$ according to Definition 1 could be conceived of as being generated by $A$ realizing a preference structure, in which case realizing a preference structure would be a trivial concept.
That this is not so is shown by the following proposition.

PROPOSITION 1: There is at least one pair $(A, A')$ of automata, $A' \subset A$, such that no preference structure $P$ exists to make $(A, A', P)$ admissible.

Proof: This is given by quoting a simple example (see Figure 1), namely a pair $(A, A')$ of automata such that

(a) $A' \subset A$,
(b) no $P$ exists to make $(A, A', P)$ admissible.

Automata $A$ and $A'$ have the same sets of input channels, output channels, and states:

set of inputs $X = \{0\}$
set of outputs $Y = \{1, 2\}$
set of states $Z = \{a_1, a_2\}$

[Figure 1: black box labelled A; A' with input channel 0 and output channels 1 and 2]

With both automata, the use of output channels is determined by the state resulting from a transition: when the resultant state is $a_i$, the automaton uses channel $i$ ($i = 1, 2$). Automata $A$ and $A'$ differ from one another in their number of possible transitions only. While automaton $A$ may remain in the initial state or alter its state on receiving the standard signal via the input channel 0, automaton $A'$ always sticks to its initial state. Obviously, automaton $A$ can make every transition open to $A'$, but not vice versa: $A' \subset A$ and $A' \ne A$.

Is there now a preference structure $P$ such that the behavior of $A'$ is the same as that of $A$ realizing this preference structure? To answer this question, first note that in this simple example there are three possible preference orderings of the set of states:

CASE 1: $a_1$ is preferred to $a_2$.
CASE 2: $a_1$ and $a_2$ are indifferent or incomparable.
CASE 3: $a_2$ is better than $a_1$.

The possible transitions are listed in Table 2 below.

Table 2. Possible Transitions

                       Open to A                                          Open to A'
Transitions            Without p.s.   Realizing preference structure
                                      p.s. 1      p.s. 2      p.s. 3
(0, a1, a1, 1)         yes            yes         yes         no          yes
(0, a1, a2, 2)         yes            no          yes         yes         no
(0, a2, a1, 1)         yes            yes         yes         no          no
(0, a2, a2, 2)         yes            no          yes         yes         yes

It is immediately apparent that none of the subautomata generated by $A$ realizing a preference structure behaves like $A'$.
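The case analysis of Table 2 can be replayed mechanically. The sketch below is a simplified reading rather than a literal transcription of Definition 2: it encodes a preference structure directly as a set of pairs (better state, worse state), omits the partition and the mapping $p$, and follows the prose rule "move to a better state if possible, otherwise to a state that is not worse". The names `realize` and `is_subautomaton` are hypothetical.

```python
U_A = {(0, "a1", "a1", 1), (0, "a1", "a2", 2),
       (0, "a2", "a1", 1), (0, "a2", "a2", 2)}
U_A_PRIME = {(0, "a1", "a1", 1), (0, "a2", "a2", 2)}


def is_subautomaton(U_B, U_A):
    """Definition 1 for equal X, Y, Z: U_B lies in U_A, and every
    (input, state) pair open in A is still open in B."""
    open_in_a = {(x, z) for (x, z, _, _) in U_A}
    open_in_b = {(x, z) for (x, z, _, _) in U_B}
    return U_B <= U_A and open_in_a == open_in_b


def realize(U, better):
    """Transitions kept when the automaton U realizes `better`.

    Assumption: `better` is a set of pairs (z_hi, z_lo) meaning that z_hi
    is strictly better than z_lo.
    """
    kept = set()
    for (x, z) in {(x, z) for (x, z, _, _) in U}:
        options = {q for q in U if q[0] == x and q[1] == z}
        improving = {q for q in options if (q[2], z) in better}
        not_worse = {q for q in options if (z, q[2]) not in better}
        # Prefer improving moves, then moves that are not worse; if neither
        # exists the automaton keeps all its transitions.
        kept |= improving or not_worse or options
    return kept


CASES = {
    "p.s. 1": {("a1", "a2")},   # a1 is preferred to a2
    "p.s. 2": set(),            # indifferent or incomparable
    "p.s. 3": {("a2", "a1")},   # a2 is better than a1
}
generated = {name: realize(U_A, rel) for name, rel in CASES.items()}
matches_a_prime = [name for name, U in generated.items() if U == U_A_PRIME]
```

Under these assumptions the three generated transition sets reproduce the three "realizing preference structure" columns of Table 2, and `matches_a_prime` stays empty although `is_subautomaton(U_A_PRIME, U_A)` holds.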
Therefore A' results from A by a process different from the realization of a preference structure. The latter is, consequently, nontrivial.

IV. NETWORKS FOR DEFINING SOCIAL INTERACTIONS*

As was pointed out at the beginning of this article, stress is laid upon the fact that aggregation processes in theory have to depict aggregation processes in reality. We maintain that aggregation takes place through social interactions; i.e., persons come to establish a group of persons and groups of persons come to establish a group of groups by an exchange of information or, at the margin, by merely existing "side by side" without any connections between them. It is this type of interpersonal connection that characterizes an aggregate. The ways persons use to communicate mutually (or refrain from doing so) are essential, for example, to answer the question of whether there exists an aggregate preference, given the individual preferences. Moreover, if one thinks of a natural person not only as an individual but as a more or less complex structure of faculties aggregated in a person, and if human faculties are conceived as built up by subfaculties (the latter by sub-subfaculties, etc.), an individual preference may also be taken as arising from the peculiar process of aggregating the "preferences" of human faculties, subfaculties, sub-subfaculties, etc. (Speaking about preferences of subhuman beings may seem a bit strange at first glance. But the behavior of, for example, a mimosa which has been touched may nonetheless be interpreted as guided by a preference structure, though possibly a very elementary one.)
The idea behind our research is this: the laws governing individual preferences which arise from more elementary preferences of sub- and sub-subfaculties, etc. are similar to those along the lines of which individual preferences are to be aggregated to a preference of a group. Therefore it seems useful to investigate the aggregation of highly elementary preference structures, the most elementary being those which are defined on a set of two alternatives only. Before tackling the problem of aggregating preferences, however, an exact definition is needed of what is meant by an aggregate. The adequate word for an aggregate in the formal language presented here is a "system of automata", i.e., a "network".

*The network-theoretical foundations of this article originate from Professor Dieter Rödding and his "network-team" at the Institut für mathematische Logik und Grundlagenforschung (Institute of Mathematical Logic and Foundations Research), University of Münster, the members of which were P. Koerber, T. Ottmann, L. Priese, and M. Winkelmann. For a detailed definition of normed networks see Priese-Rödding [15] 1974.

1. The Definition of Networks

A network is given by the total of automata belonging to it and the statements about the connections existing between them. A connection between two automata results from the identification of an output channel of one automaton with an input channel of the other. At any point in time the internal state of a network is given by the vector of the internal states of its automata.

The following inductive definition gives a precise notion of what is meant by a network. It makes use of two combinatorial processes: one in which (possibly very elementary) automata or networks between which no connections exist are conceived as a network (A' + A''; see Figure 2), and a second in which an originally free input channel becomes connected to an originally free output channel (A'_{x,y}; see Figure 3).
[Figure 2: the network A' + A''. Figure 3: the network A'_{x,y}.]

DEFINITION 3:

(1) Every automaton is a network.

(2) If A' := (X', Y', Z', U'), U' ⊂ X' × Z' × Z' × Y', and A'' := (X'', Y'', Z'', U''), U'' ⊂ X'' × Z'' × Z'' × Y'', are networks, then A' + A'' and A'_{x,y} are networks as well.

A' + A'' and A'_{x,y} are defined as follows:

A' + A'' := (X, Y, Z, U), where
X = X' ∪ X''
Y = Y' ∪ Y''
Z = Z' × Z''
\[ U = \{ (x, (z_1', z_1''), (z_2', z_2''), y) \mid ((x, z_1', z_2', y) \in U' \wedge z_1'' = z_2'') \vee ((x, z_1'', z_2'', y) \in U'' \wedge z_1' = z_2') \} \]

A'_{x,y} := (X, Y, Z, U), where
X = X' − {x}
Y = Y' − {y}
Z = Z'
\[ U = \{ (x', z, z', y') \mid \exists \text{ finite sequence } (x_0, z_0, z_0', y_0), \ldots, (x_n, z_n, z_n', y_n) \text{ such that } x_0 = x',\ z_0 = z,\ z_n' = z',\ y_n = y',\ \forall i < n:\ y_i = x_{i+1},\ z_i' = z_{i+1},\ \forall i \le n:\ (x_i, z_i, z_i', y_i) \in U' \} \]

If a network is represented by a diagram, the sequence of steps of construction according to the inductive definition is no longer visible. However, this is not essential, since networks which are distinguished only in the sequence of steps of construction, but which result in the same diagram, are equivalent (for proof, see Priese 1976 [14]). Of course, the complete line of reasoning can be formulated in pure set theory. But such would only complicate matters unduly without any benefits.

2. Some Networks of Automata Realizing Preference Structures

Given an exact notion of what is meant by an aggregate or a social system, a theorem similar to the Impossibility Theorem (Arrow 1951 [1]) is sought. This leads to the question of whether a network built up of automata each realizing a preference structure will always realize a system's preference. The answer is no, as intuition would also show.

PROPOSITION 2: Let N be a network of components B_i (1 ≤ i ≤ n) and N' be a network of components B'_i. Assume that P_i are preference structures such that (B_i, B'_i, P_i) are admissible triples. N and N' are not distinguishable with respect to their construction. Consequently, N' ⊂ N holds as well as B'_i ⊂ B_i. Then there does not always exist a preference structure P such that (N, N', P) is admissible.
As a proof, an automaton B is given (see Figure 4) with the

set of inputs X = {0, 1}
set of outputs Y = {2, 3, 4}
set of states Z = {b2, b3, b4, b5}

and possible transitions

(1) (0, b4, b2, 2)    (3) (1, b2, b2, 2)    (6) (1, b3, b2, 2)
(2) (0, b5, b3, 3)    (4) (1, b2, b3, 3)    (7) (1, b3, b3, 3)
                      (5) (1, b2, b4, 4)    (8) (1, b3, b5, 4).

[Figure 4: automaton B with input channels 0 and 1 and output channels 2, 3, and 4.]

Obviously, there are gaps in this table of behavior. Nothing is said about what B does on receiving an input via channel 0 when its state happens to be b2 or b3, or when the standard signal entering B via channel 1 finds B in state b4 or b5. The example dealt with is that of an incomplete automaton.

Now, by identifying output channel 4 with input channel 0, automaton B becomes the network B_{0,4} (see Figure 5) with

X_{0,4} = X − {0} = {1}
Y_{0,4} = Y − {4} = {2, 3}

[Figure 5: the networks B_{0,4} and B'_{0,4}.]

Note that whenever the standard signal is disseminated by B via channel 4, it returns to B via channel 0, thus staying within the network, and then leaves the network via channel 2 or 3 depending on the resultant state of B. Consequently, the possible transitions of B_{0,4} are

(1, b2, b2, 2), resulting from the transition sequences of B: (1, b2, b2, 2) or (1, b2, b4, 4), (0, b4, b2, 2)
(1, b2, b3, 3)
(1, b3, b2, 2)
(1, b3, b3, 3), resulting from the transition sequences of B: (1, b3, b3, 3) or (1, b3, b5, 4), (0, b5, b3, 3).

It is not hard to see that the set of transitions open to B_{0,4} is identical to that open to the automaton A of the above quoted example, and that A and B_{0,4} have the same sets of inputs, outputs, and states: B_{0,4} = A.

The next step is to consider automaton B', which results from B realizing the incomplete preference P:

P: b4 is better than b2, b5 is better than b3.

Based on B', the network B'_{0,4} is formed in the same way as was B_{0,4} with B.
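The construction of B_{0,4} and B'_{0,4} can be traced with a short Python sketch. It rests on assumptions that are not in the article: the function names `feedback`, `realize`, and `rename` are hypothetical, the realization rule is read informally (improving transitions first, then transitions that are not worse, otherwise everything), and the identification with A and A' is taken to hold up to the renaming b2 to a1, b3 to a2, input 1 to 0, and outputs 2, 3 to 1, 2.

```python
def feedback(transitions, x_fb, y_fb):
    """External transitions after identifying output y_fb with input x_fb."""
    external = set()
    starts = {(x, z) for (x, z, _, _) in transitions if x != x_fb}
    for x0, z0 in starts:
        pending, seen = [(x0, z0)], set()
        while pending:
            situation = pending.pop()
            if situation in seen:
                continue
            seen.add(situation)
            for (x, z, z_next, y) in transitions:
                if (x, z) != situation:
                    continue
                if y == y_fb:
                    pending.append((x_fb, z_next))  # signal re-enters the automaton
                else:
                    external.add((x0, z0, z_next, y))  # signal leaves the network
    return external


def realize(transitions, better):
    """Assumed reading: improving moves, else not-worse moves, else all moves."""
    by_situation = {}
    for t in transitions:
        by_situation.setdefault((t[0], t[1]), set()).add(t)
    kept = set()
    for (_, z), options in by_situation.items():
        improving = {t for t in options if (t[2], z) in better}
        not_worse = {t for t in options if (z, t[2]) not in better}
        kept |= improving or not_worse or options
    return kept


def rename(transitions):
    """Relabel B's channels and states with those of A (an assumed identification)."""
    state = {"b2": "a1", "b3": "a2"}
    output = {2: 1, 3: 2}
    return {(0, state[z], state[z_next], output[y]) for (_, z, z_next, y) in transitions}


B = {(0, "b4", "b2", 2), (0, "b5", "b3", 3),
     (1, "b2", "b2", 2), (1, "b2", "b3", 3), (1, "b2", "b4", 4),
     (1, "b3", "b2", 2), (1, "b3", "b3", 3), (1, "b3", "b5", 4)}
P = {("b4", "b2"), ("b5", "b3")}  # b4 better than b2, b5 better than b3

A = {(0, "a1", "a1", 1), (0, "a1", "a2", 2), (0, "a2", "a1", 1), (0, "a2", "a2", 2)}
A_PRIME = {(0, "a1", "a1", 1), (0, "a2", "a2", 2)}

B_04 = feedback(B, 0, 4)
B_PRIME = realize(B, P)
B_PRIME_04 = feedback(B_PRIME, 0, 4)
print(rename(B_04) == A, rename(B_PRIME_04) == A_PRIME)
```

The search in `feedback` follows the signal through the fed-back channel until it leaves by a free output, which is one way to read the finite-sequence condition of Definition 3.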
Inspection of the behavioral possibilities of B'_{0,4} with the help of the behavioral table of B_{0,4} indicates that lines two and three are to be eliminated, since automaton B, if realizing P, must always move from b2 to b4 or from b3 to b5; therefore B'_{0,4}, seen from the outside, always realizes

(1, b2, b2, 2), resulting from the transition sequence of B': (1, b2, b4, 4), (0, b4, b2, 2)

or

(1, b3, b3, 3), resulting from the transition sequence of B': (1, b3, b5, 4), (0, b5, b3, 3).

Thus the network B'_{0,4} has but two modes of behavior, for when it receives the input signal it sticks to its initial state and informs the outside world via the corresponding channel. A comparison with subautomaton A' of Proposition 1 shows B'_{0,4} = A'. From this it also follows that

B'_{0,4} = A' ⊂ A = B_{0,4},

and thus

B'_{0,4} ⊂ B_{0,4}.

According to the proof of Proposition 1, there exists no P to make (A, A', P) admissible, nor is there any admissible (B_{0,4}, B'_{0,4}, P). In other words, the network B_{0,4} does not realize a preference structure, though its only component, automaton B, does.

Proof of Proposition 2 leads to an analysis concerning the theory of revealed preferences. Imagine that a researcher interested in the behavior of a certain agent, individual or collective, tries to investigate him by questioning, observing, or some similar procedure. The agent may be represented by an automaton C very similar to automaton B since it has identical input and output sets as well as an identical set of states (here denoted by c2, c3, c4, c5). The behavioral pattern of C is the same as that of B except for those transitions beginning with the use of input channel 0. If the standard signal enters C by channel 0, the following changes in the rules concerning B hold:

(0, c4, c3, 3), (0, c5, c2, 2).

If input channel 1 is used, the same transitions are induced as with B.
Let a feedback "channel 4 to channel 0" be a part of the model representing the agent to be investigated as in the case of B. For the external observer unaware of the agent's internal structure, the model is given by a network C_{0,4} with an automaton C. C_{0,4} has the following behavioral pattern:

(1, c2, c2, 2)
(1, c2, c3, 3), resulting from the transition sequences of C: (1, c2, c3, 3) or (1, c2, c4, 4), (0, c4, c3, 3)
(1, c3, c2, 2), resulting from the transition sequences of C: (1, c3, c2, 2) or (1, c3, c5, 4), (0, c5, c2, 2)
(1, c3, c3, 3).

Let the linear preference structure c5 P c4 P c3 P c2 be imposed on C; this means that c5 is the most and c2 the least valued state. Let the use of channel 1 represent the question: "Is there any state which you prefer to your present one?" If the experiment terminates with the initial state, this is to be regarded as a negative answer; if it ends with another state, the answer is positive. When the network C_{0,4} is in the initial state c2 and is entered by a signal, it will always end up in state c3, regardless of whether automaton C immediately turns to state c3 or first to state c4. On the contrary, if C_{0,4} is initially in state c3, the experiment always terminates with state c2. For C turns to state c5 due to the preference ordering; the signal leaves C by channel 4 and is redirected to C by channel 0. So C now must turn to state c2 and give the signal to the outside world via channel 2.

If the experimenter performs an uninterrupted sequence of experiments with the network C_{0,4} without manipulating it, he will receive alternating answers to his question. Without paying attention to the initially existing state, as is generally customary in current preference theory, the interviewer could conclude from the behavior of the network (i.e., the agent being investigated) that

• either C has a circular preference (c2 P c3 P c2), or
• C is indifferent as to c2 or c3, or
• C cannot compare c2 and c3.
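The alternating answers can be reproduced with a small simulation. This is a sketch under assumptions: the helper names `choices`, `outcomes`, and `run` are hypothetical, the linear ordering is encoded by integer ranks, and C is taken to follow the informal rule (an improving transition if one exists, otherwise one that is not worse, otherwise whatever transition is open).

```python
# Transitions of C as tuples (input, state, next_state, output).
C = {(1, "c2", "c2", 2), (1, "c2", "c3", 3), (1, "c2", "c4", 4),
     (1, "c3", "c2", 2), (1, "c3", "c3", 3), (1, "c3", "c5", 4),
     (0, "c4", "c3", 3), (0, "c5", "c2", 2)}
RANK = {"c2": 0, "c3": 1, "c4": 2, "c5": 3}  # c5 P c4 P c3 P c2


def choices(state, channel):
    """Transitions C may take in `state` on `channel` under the assumed rule."""
    options = [t for t in C if t[0] == channel and t[1] == state]
    improving = [t for t in options if RANK[t[2]] > RANK[state]]
    not_worse = [t for t in options if RANK[t[2]] >= RANK[state]]
    return improving or not_worse or options


def outcomes(state):
    """All (final state, output) pairs of one experiment on the network C_{0,4}."""
    results, pending, seen = set(), [(1, state)], set()
    while pending:
        channel, z = pending.pop()
        if (channel, z) in seen:
            continue
        seen.add((channel, z))
        for (_, _, z_next, y) in choices(z, channel):
            if y == 4:
                pending.append((0, z_next))  # feedback: channel 4 re-enters as channel 0
            else:
                results.add((z_next, y))
    return results


def run(initial_state, experiments):
    """Repeat the experiment without resetting the network in between."""
    history, state = [], initial_state
    for _ in range(experiments):
        state, output = next(iter(outcomes(state)))  # a single outcome in this example
        history.append((state, output))
    return history


history = run("c2", 4)
print(history)
```

Each experiment ends in the state other than the one it started from, so an observer who ignores the initial state sees the two outputs alternate.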
Obviously all three conclusions are incorrect. If, however, the interviewer tries to take the initial state of C_{0,4} into account when he poses his question, then he soon will see that the agent constantly turns to the other state. From that he might deduce that either

• the agent does not know what he wants, or
• his preference ordering depends on the precise situation in which he happens to be.

(One could not deduce in this case that states c2 and c3 are indifferent or incomparable. Otherwise in the course of a sufficiently large number of experiments, the one or the other would have to end with the initial state and corresponding output.) Both conclusions fail to deal with the actual circumstances.

Consequently, one has to note that reliable explanations of the maxims of agents' actions cannot be grasped from their behavior. One could almost call this realization trivial, were it not for the theory of revealed preferences. Indeed, we are aware of doubts as to the usefulness of this theory (Morgenstern 1972 [12], p. 1167, and 1934 [9], pp. 436, 448, 456), but up to now we have not encountered the assertion that a preference ordering may fail to be discovered by observation due to conditions on the internal structure of communication, or that, owing to these conditions, an agent's maxims of action can appear to be completely different from what they really are.

One could make the objection to this approach that its conclusions possibly follow from the fact that the network C_{0,4} realizes no preference structure. It would not be difficult to meet this criticism: replace the transition (0, c4, c3, 3) open to C by (0, c4, c2, 2) and the transition (1, c2, c3, 3) by (1, c2, c3, 4). Then every experiment with C_{0,4} ends with state c2 and output 2. C_{0,4} thus behaves as if it were governed by the preference ordering c2 P c3. Consequently, the experimenter would gain a completely misleading impression of C's maxims of action.
This investigation of problems concerning the theory of revealed preferences has stressed the difficulties of gaining a nomological hypothesis about agents' behavioral patterns from the observation of their actual behavior alone. It will also be necessary to experiment with networks of automata realizing preference structures (possibly with the aid of a computer) and to study their behavior. Propositions of empirical content concerning individual and group behavior may possibly be derived at a later date on the basis of experience gained in this way (Rödding 1973 [16]). It may even be possible to find networks which can be used for purposes of forecasting before their behavior is explained. Thus we are taking a path familiar to natural scientists in the formation of their theories, the path of experimental science. Whether the desired results can be derived by this procedure is, admittedly, still an open question. Theoretically we have a long way to go. Nonetheless, we hope that it will be possible to speak unmistakably on complicated problems by means of network theory, and to speak moreover in a language also familiar to a computer.

Having decided to take this approach, one is confronted with the problem of how to start. Up to now, we have really only given a precise formulation to intuition and proved what could be intuitively anticipated. If, however, the problem is to be delved into more deeply, two lines of further questioning present themselves.

Firstly, how do preferences arise? In particular,

(a) can the behavior of an automaton with preferences be simulated by the behavior of a network of components, each of which belongs to a certain well-defined stockpile, and realizes at any given moment only "elementary preferences"?
(b) can one analyze these elementary preferences combinatorially in the following sense: does a stockpile of automata without preferences exist, the aggregation of which results in a network with an elementary preference?

Secondly, how do aggregative preferences arise? In particular,

(a) can some or even all classes of automata with preferences and/or classes of connections between them be specified such that the aggregation of automata always generates a network with preference?

(b) are there processes of construction which allow for correctly calculating the preferences of a network, given the preferences and the connections of the automata (see especially Blin 1973 [2])?

We cannot at the present stage of research provide exhaustive answers to all four questions. But we are able to contribute to the answers of questions 1a and 2a, as discussed in the following section.

V. THE AGGREGATION OF PREFERENCES

The discussion of the theorem developed below requires an explanation of a particular concept of simulation.

DEFINITION 4: B simulates A if, for every initial state of A, there exists an initial state of B such that the resulting sets of protocols are identical; i.e., B simulates A if
\[ X_A \subset X_B; \quad Y_A \subset Y_B; \quad \forall (z \in Z_A)\ \exists (z' \in Z_B)\ (\mathrm{Prot}(z) \text{ with } A = \mathrm{Prot}(z') \text{ with } B). \]

THEOREM: For every automaton A and every subautomaton A' ⊂ A there exist an automaton B, a subautomaton B' ⊂ B, and a preference structure P with the following property: B realizing P generates B' such that B simulates A, and B' simulates A'.

This theorem does not contradict Proposition 1. The latter states that there does not exist a respective preference structure P to make (A, A', P) admissible for all pairs of automata (A, A'), A' ⊂ A. By the theorem, one can find for any pair (A, A'), A' ⊂ A, an admissible triple (B, B', P). This is a fact regardless of whether A' arises from A by virtue of a preference or otherwise. B and B' will generally be richer in behavioral possibilities than A and A'.
The point is that B (B') is able to conduct itself with respect to its inputs and outputs, but not necessarily with respect to its changing of states, like A (A').

PROOF OF THE THEOREM: It will suffice to prove the following two assertions:

(a) There exist a basis of elementary automata (called blocks) with which certain invariant preferences are associated and a principle K of constructing networks from these blocks such that every network which is constructed in accordance with K and which may be conceived as a pair of automata with respect to the preferences imposed on its blocks realizes a preference structure.

(b) For any pair of automata (A, A'), A' ⊂ A, there exists an admissible triple (N, N', P), with N constructed in accordance with K and P resulting from the blocks of N realizing their preference structures, such that N simulates A and N' simulates A'.

In contrast with Ottmann [13], who proves a similar proposition of simulation, we provide a version of the proof which is particularly suited to our concern with the possibility of behavioristic simulation as opposed to the question of isomorphisms. However, first the blocks forming the basis of network construction, the construction principle K, and a precise definition of what is meant by a network's preference structure must be provided.

DEFINITION 5: Let the set of blocks K, H, I, and I' be the basis (see Table 3).

Table 3

block   input set    output set      set of states    transitions open to block
K       {1, 2}       {3}             {k}              (1, k, k, 3)
                                                      (2, k, k, 3)
H       {1, 4, 6}    {2, 3, 5, 7}    {top, bottom}    (1, top, top, 2)
                                                      (1, bottom, bottom, 3)
                                                      (4, top, top, 5)
                                                      (4, bottom, top, 5)
                                                      (6, top, bottom, 7)
                                                      (6, bottom, bottom, 7)
I       {0}          {1, 2}          {i}              (0, i, i, 1)
                                                      (0, i, i, 2)
I'      {0}          {1, 2}          {top, bottom}    (0, top, top, 1)
                                                      (0, top, bottom, 2)
                                                      (0, bottom, top, 1)
                                                      (0, bottom, bottom, 2)

For block H the set of channels 1, 2, 3 is called the "testing mechanism", the set of channels 4, 5 the "upper adjusting mechanism", and the set of channels 6, 7 the "lower adjusting mechanism". Block H could be substituted by a block even more "elementary" [13]. With K and H determinate and I and I' indeterminate automata, I' is the only indeterminate automaton among the four blocks with more than one, namely two, states. Consequently, the behavior only of a block I' can be affected by a preference ordering. It may be assumed without any loss of generality that the state "top" is favored by any I'.

DEFINITION 6 (Construction principle K): For every I' belonging to a network, the following holds: among the exits potentially influencing the entrance into an I', there is no exit from an I or an I'.

DEFINITION 7 (The preference structure of a network built up in accordance with K): The states of a network are identified with vectors, the elements of which denote the states of the network's blocks. Two vectors are said to be indifferent if they differ only in components related to a block H. (A preference ordering of the states of H makes no sense, since H is determinate.) A vector is called better than another if it is better in at least one of the remaining components and worse in none.

PROOF OF THEOREM, PART a: It is to be shown that a network constructed according to K will improve its state if possible and will leave its state unchanged if an improvement is impossible. For this proof, the path of a signal x along the course of a (ceasing) procedure within the network is to be considered (Figure 6).
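The blocks of Table 3 and the vector ordering of Definition 7 are small enough to write down directly. The following Python sketch is illustrative only; the names `BLOCKS`, `is_determinate`, and `vector_better`, and the numeric ranks for "top" and "bottom", are assumptions introduced here. "Determinate" is taken to mean that at most one transition is open for each pair of input and state.

```python
# Blocks of Table 3 as sets of transitions (input, state, next_state, output).
BLOCKS = {
    "K": {(1, "k", "k", 3), (2, "k", "k", 3)},
    "H": {(1, "top", "top", 2), (1, "bottom", "bottom", 3),
          (4, "top", "top", 5), (4, "bottom", "top", 5),
          (6, "top", "bottom", 7), (6, "bottom", "bottom", 7)},
    "I": {(0, "i", "i", 1), (0, "i", "i", 2)},
    "I'": {(0, "top", "top", 1), (0, "top", "bottom", 2),
           (0, "bottom", "top", 1), (0, "bottom", "bottom", 2)},
}
RANK = {"top": 1, "bottom": 0}  # "top" is the favored state of a block I'


def is_determinate(transitions):
    """True if every (input, state) pair has at most one open transition."""
    situations = [(x, z) for (x, z, _, _) in transitions]
    return len(situations) == len(set(situations))


def vector_better(v, w, kinds):
    """Definition 7: v is better than w; kinds[i] names the block of component i."""
    pairs = [(RANK.get(a, 0), RANK.get(b, 0))
             for a, b, kind in zip(v, w, kinds) if kind != "H"]  # H is ignored
    return any(a > b for a, b in pairs) and all(a >= b for a, b in pairs)


determinate = {name: is_determinate(u) for name, u in BLOCKS.items()}
print(determinate)

kinds = ("H", "I'", "I'")
print(vector_better(("bottom", "top", "top"), ("top", "bottom", "top"), kinds))
```

Components belonging to K or I have a single state and therefore never contribute to the comparison, so in effect only the I' components decide whether one network state is better than another.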
[Figure 6: path of a signal through blocks H within the network.]

By K, a signal x passes through at most one block I' and can thereby at most improve the state of the network; if the signal passes through an I', its path to there is uniquely determined, and the signal leads to an improvement of the network's state if this was possible in this block. Otherwise, the network's state remains unaltered.

DEFINITION 8 (The generalization H+ of H): H is regarded as an automaton with two states, called, for example, "top" and "bottom", as well as an upper and a lower adjusting mechanism, each consisting of an entrance and an exit, and a testing mechanism consisting of one entrance and one upper and one lower exit. Then any generalization H+ of H represents an automaton also having two states and with a finite number of upper and lower adjusting mechanisms both independent of one another. Its testing mechanisms, finite in number as well, "switch over" simultaneously with a change of state.

It will be proved in the appendix that H+ can be simulated by networks of H and K.

PROOF OF THEOREM, PART b: Assume any A' ⊂ A. Consider a transition (x, z, z', y) belonging both to A and to A', and another transition (x, z, z'', y') belonging to A, but not to A'. Let state z be represented by a certain block H+(z) related to this state and in state "top". All other blocks H+ happen to be in state "bottom". Consider the path of a standard signal entering the network by channel x. The signal first passes through some of the blocks H+ via those testing mechanisms which each contain an entrance x. The lower exits of these testing mechanisms are connected to the entrance x of the block H+ which is the next to be passed. Possibly, the signal reaches H+(z) and leaves it via the upper exit of the relevant testing mechanism. Then, by running through one of the lower adjusting mechanisms, the signal puts H+(z) into its state "bottom". Obviously, the exit thus used is uniquely related to the pair (x, z).
For this pair, as for any other pair (x, z), there exists exactly one block I' in the network, which is now entered by the signal. Behind the two exits of I' are found "fans" made from blocks I, i.e., automata with only one state, which indeterminately dispatch a signal entering via the single entrance through one of their exits. In particular, the "fan" behind the upper exit of I' has an exit related to the transition (x, z, z', y), and the "fan" connected to the lower exit of I' has an exit related to the transition (x, z, z'', y'). After entering I' the signal can take either exit if I' does not realize its preference, while the signal is restricted to the upper exit if I' realizes its preference. As the further course is analogous in both cases, we can limit ourselves to (x, z, z', y). Now the signal puts block H+(z') in its state "top" by passing through one of its upper adjusting mechanisms. After running through several blocks K the signal leaves the network via exit y.

The proof shows, incidentally, that since preferences affect only blocks I', the theorem is compatible with several concepts of realizing a preference structure, and not just with the particularly weak one dealt with here.

VI. A REVIEW AND A PREVIEW

1. Some Concluding Remarks

We have devoted this combinatorial investigation to the concept of preferences because, in our opinion,

• the concept of preferences is fundamental to any theory that aims at explaining the acting of persons or groups of persons,
• the recognition of the combinatorial character of economic problems [10] is a prerequisite for building an exact economic science [8].

The complex nature of subjects dealt with in the social sciences does not excuse the use of inexact concepts. We have decided to employ a certain concept of logical network of finite automata.
We have been guided by the intuitive idea that elementary processes are realized in the network's elementary automata and that the aggregation of these elementary processes within the network induces processes of higher and higher complexity according to the stage of aggregation. As far as we can see, the complexity of a theory developed on this basis is no argument against its adequacy; a major reason for its complexity is the use of indeterminate automata, which is completely indispensable in the field of social sciences, particularly when one faces an analysis of preferences. In our view, an extraordinary discrepancy exists between the complexity of subjects in social sciences and the methods currently employed to give precise notions of these subjects. Shubik [17], for example, demands a microeconomic theory more abstract than the current one and, at the same time, capable of including many more factual details. We hope to have provided through our discussion of the very flexible automata concept a starting point for the development of such a theory.

2. Some Remarks Concerning Further Research

The definition of an automaton given in section II, part 1 was dropped beginning in section II, part 2 as more technical results were presented which restricted the possibilities of modelling. In this connection, two possible extensions ought to be pointed out; namely, the meaning-preserving transfer of the definition of realizing a preference structure to the more general concept of (α, β, γ)-transitions, and the idea of connecting transitions rather than states with structures of preferences.

As to the first point, it should be noted that one can achieve the simulation of (α, β, γ)-automata by automata of the later expounded restricted type by a technical device. This is done by providing each one of the latter type with an additional input channel and an additional output channel.
In this manner, simulating sequences of α-transitions, β-transitions, or γ-transitions of an automaton of the more general type turns out to be possible. An α-transition is simulated by an automaton of the restricted type delivering the signal via the additional output channel, a γ-transition by receiving the signal via the additional input channel, and a β-transition by using both of the additional channels. This procedure is indeed artificial in view of the socioeconomic phenomena to be depicted here, but it shows that the use of (X, Y, Z, U)-automata instead of (X, Y, Z, α, β, γ)-automata is not essential.

Network theory of the kind employed here naturally permits plausible alternatives to the notion of realizing a preference structure presented in this article and related to a certain kind of satisficing behavior. By this notion, a course of action leading in some way to an optimum state is taken only indirectly. A further maxim could be formulated as follows: "If an automaton makes a transition, there is no transition open to it such that the resultant state would be better than that resulting from the actual transition". It is not yet clear whether the propositions proved here are valid for a concept of realization oriented to this optimizing maxim. To be sure, and that should be the essence of this consideration, network theory is not restricted to a certain class of realization concepts.

The given definition of an automaton realizing a preference structure does not take into account the fact that strong arguments can be put forward for a preference ordering dependent on the actual state of the automaton.

According to the concept of admissible triples presented here, an automaton A fixes the total of possible preference structures with which it can be combined so that an admissible triple (A, B, P) results.
As a matter of duality, a preference structure determines the total of automata A which can realize P by virtue of an admissible triple (A, B, P). Up to now we have had little knowledge of the two problems of representatives which emerge here. Certainly one must ask how far these questions posed are correlated to the substance of ideas which led to our considerations.

Obviously, one should not stop with an analysis of preferences. Rather, the task presents itself to analyze the formation of further concepts in the economic and sociological fields (as has been done, for example, for the concept of power with the formal aids used here by Rödding [16]) with the hope of arriving at theorems on a basis defined this way. The present article should be regarded only as a first step in this direction.

VII. ACKNOWLEDGMENT

The authors wish to express their gratitude to Dr. Lutz Priese, University of Dortmund, for contributing the examples for Propositions 1 and 2.

APPENDIX

SIMULATION OF BLOCK H+ BY NETWORKS OF H AND K

DEFINITION A1 (generalizations of H): H_{l,m,n} (l, m, n being finite positive integers) denotes that generalization of H which has l testing mechanisms, m upper adjusting mechanisms, and n lower adjusting mechanisms.

PROPOSITION A1: Any H_{l,m,n} can be simulated according to Definition 4 by a network of blocks H and K (H = H_{1,1,1}).

PROOF: Given by six steps and a corollary.

STEP 1 (simulation of H_{2,m,n} by use of two blocks H_{1,m,n}, m ≥ 1, n ≥ 1): Let the exit of each upper (lower) adjusting mechanism of one H_{1,m,n}
be connected to the entrance into the corresponding adjusting mechanism of the other block H_{1,m,n}. Consequently, both automata are simultaneously switched whenever the network which is generated in this way is used by the entrances which belong to an adjusting mechanism and are still free; in addition, there are two available testing mechanisms which are independent of one another and capable of being switched over simultaneously. Thus, the network behaves like H_{2,m,n} (Figures A1 and A2).

STEP 2 (simulation of H_{l,m,2}, l ≥ 1, m ≥ 1, by means of one block H, one block H_{l,m,1}, and one K): Let the exits of the adjusting mechanisms of H be connected via the block K to the entrance of the lower adjusting mechanism of H_{l,m,1}. Inspection of the graphical representation will easily show that the network does indeed simulate H_{l,m,2} (Figures A3 and A4).

[Figures A1 to A4.]

STEP 3 (simulation of H_{l,2,n}, l ≥ 1, n ≥ 1, by a network of one block H_{l,1,n} and one K): The proof can be provided analogously to Step 2.

For the next three steps, assume that simulation of H_{l,m,n} (l ≥ 1, m ≥ 1, n ≥ 1) by networks of blocks H and K has been shown. Steps 4 to 6 prove by mathematical induction that the procedures employed in Steps 1 to 3 can be generalized.

STEP 4 (generalization of Step 1: simulation of H_{l+1,m,n} by use of one H_{l,m,n} and one H_{1,m,n}): The graphical representation suffices to show the correctness of the assertion.
• Definition of Hj+i,„,„: Diagram (Figure A5) i + 1 ) + 1 2 ( L + 1 ) + 1 2 (1 H ) -1 3 ( 1 + 1 ) - 1 2(1+1) 3(1+1) Figure A5 Table of transitions (WX, top, top, Wi+i+x) (wx, bottom, bottom, M2(i+i)+x) {vis., top, top, IJ„+p) {v\i, bottom, top, v^+m) (w„ top, bottom, w„+^) {Wy, bottom, bottom, Wn+,) 1<X<Z+1 1<X<Z+1 l</i<m \<\s.<m \<v<n \<v<n • Simulation of Hi^i_„_n by means of Ht^m.n and H^^^.n (Figure A6) : STEP 5 (generalization of Step 2: simulation of Ht.m.n+i by one Hl^^.n, one H, and one K; Figure A7 and A8) : STEP 6 (generalization of Step 3: simulation of ///;„+!.„) : Analogous to Step 5. Corollary: Any given automaton Hi^^.n with arbitrarily chosen positive finite integers I, m, and n can be simulated by a network of automata H and K. AGGREGATION OF PREFERENCES 77 + + 3 ii ^^ e + :n e > ■> + c •^ >••■) ■e-c ^ E . - .— + —1 a: i 1 + — + 1 1 ^ + + + i> M <> a >^ > ^ ;- (--J — ^ > + ^H 3 3' ,!•••• ....< k to « U O o o 78 W. RODDING & H. H. NACHTKAMP l+l n n ' Figure A7 Figure A8 BIBLIOGRAPHY [1] Arrow, K., Social Choice and Individual Values (Wiley, 1951). [2] Blin, J. M. , Patterns and Configurations in Economic Science. A Study oj Social Decision Processes, Reidel (1973). [3] Chipman, J. S., "The Foundations of Utility," Econometrica S8, p. 193-224 (1960). [4] Churchman, C. W., Prediction and Optimal Decision, Philosophical Issues of a Science of Values, p. 231 (Prentice-Hall, 1961). AGGREGATION OF PREFERENCES 79 [5] Green, H. A. J., Aggregation in Economic Analysis. An Introductory Survey, Princeton, Uni- versity Press, p. VII (1964). [6] von Kempski, J. "Handlung, Maxime und Situation. Zur logischen Analyse der mathe- matischen Wirtschaftstheorie," in Theorie und Realitdt, ed., by H. Albert, p. 139-152 (Mohr, 1972). [7] Koch, H., "Die zeitliche Modellstruktur einer handlungsanalytisch konzipierten Theorie der Unternehmung dargestellt anhand der Theorie des Absatzes," in Zur Theorie des Absatzes, Festschrift zum 75. Geburtstag von E. Gutenberg, ed. by H. 
Koch, p. 215-262 (Gabler 1973).
[8] Menger, K., "Bemerkungen zu den Ertragsgesetzen," Zeitschrift für Nationalökonomie 7, 25-46 (1936).
[9] Morgenstern, O., "Das Zeitmoment in der Wertlehre," Zeitschrift für Nationalökonomie 6, 433-458 (1934).
[10] Morgenstern, O., "Demand Theory Reconsidered," Quarterly Journal of Economics 6 (1948). Reprinted in The Evolution of Modern Demand Theory: A Collection of Essays, R. B. Ekelund, Jr., et al. (Eds.) (Heath Lexington Books, 1972).
[11] Morgenstern, O., "John von Neumann, 1903-1957," Economic Journal 68, 170-174 (1958).
[12] Morgenstern, O., "Thirteen Critical Points in Contemporary Economic Theory: An Interpretation," Journal of Economic Literature 10, 1163-1189 (1972).
[13] Ottmann, T., "Über Möglichkeiten zur Simulation endlicher Automaten durch eine Art sequentieller Netzwerke aus einfachen Bausteinen," Zeitschrift für mathematische Logik und Grundlagenforschung, 223-238 (1974).
[14] Priese, L., "Reversible Automaten und einfache universelle 2-dimensionale Thue-Systeme," Zeitschrift für Mathematische Logik und Grundlagen der Mathematik (1976).
[15] Priese, L., and D. Rödding, "A Combinatorial Approach to Self-Correction," Journal of Cybernetics 4 (3), 7-25 (1974).
[16] Rödding, W., Macht: Präzisierung und Messbarkeit, in Macht und Ökonomisches Gesetz, Verhandlungsband der Jubiläumstagung der Gesellschaft für Wirtschafts- und Sozialwissenschaften-Verein für Socialpolitik, ed. by H. K. Schneider und Chr. Watrin, Duncker & Humblot, pp. 457-472 (1973).
[17] Shubik, M., "A Curmudgeon's Guide to Microeconomics," Journal of Economic Literature 8, p. 409 (1970).

NON STATIONARY STOCHASTIC GOLD-MINING: A TIME-SEQUENTIAL TACTICAL-ALLOCATION PROBLEM

Gaineford J. Hall, Jr.
The University of Texas
Austin, Texas

ABSTRACT

This paper presents an extension of gold-mining problems formulated in earlier work by R. Bellman and J. Kadane.
Bellman assumes there are two gold mines labeled A and B, respectively, each with a known initial amount of gold. There is one delicate gold-mining machine which can be used to excavate one mine per day. Associated with mine A is a known constant return rate and a known constant probability of breakdown. There is also a return rate and probability of breakdown for mine B. Bellman solves the problem of finding a sequential decision procedure to maximize the expected amount of gold obtained before breakdown of the machine. Kadane extends the problem by assuming that there are several mines and that there are sequences of constants such that the $j$th constant for each mine represents the return rate for the $j$th excavation of that mine. He also assumes that the probability of breakdown during the $j$th excavation of a mine depends on $j$. We extend these results by assuming that the return rates are random variables with known joint distribution and by allowing the probability of breakdown to be a function of previous observations on the return rates. We show that under certain regularity conditions on the joint distributions of the random variables, the optimal policy is: at each stage always select a mine which has maximal conditional expected return per unit risk. This gold-mining problem is also a formulation of the problem of time-sequential tactical allocation of bombers to targets. Several examples illustrating these results are presented.

1. INTRODUCTION AND SUMMARY

Richard Bellman, in his now classic book Dynamic Programming [1], introduces in Chapter II a stochastic multistage decision process in the guise of a stochastic gold-mining problem. It is assumed that there are two gold mines A and B, which initially contain amounts of gold $x > 0$ and $y > 0$, respectively.
There is a single gold-mining machine with the property that if it is used to mine gold in A, there is a probability $p_1 > 0$ that it will mine a fraction $r_1$ of the gold there and remain in working order, and a probability $1 - p_1$ that it will mine no gold and be damaged beyond repair. Similarly, B has associated with it the corresponding probabilities $p_2$ and $1 - p_2$, and fraction $r_2$. On the first day of mining, the miner must decide to work either A or B. If the machine does not break down, the miner must again make a choice between A and B on the second day. The process continues until the machine breaks down. Bellman solves the problem of finding an optimal sequence of mines to maximize the expected amount of gold withdrawn before the machine breaks down.

At the chapter's end, Bellman asserts (but does not prove) that his methods of proof can be used to solve the problem extended to the case of $N$ mines and that each constant fraction $r_i$ can be replaced by a random variable, giving (essentially) for each $i$ a sequence $r_{i1}, r_{i2}, \ldots$ of independent and identically distributed (i.i.d.) random variables, where $r_{ij}$ is the fraction of gold of mine $i$ obtained on the $j$th excavation of $i$ (given the machine is still working). However, the extension of Bellman's methods to this much more interesting (and much harder) problem is difficult to see, and the author has not found the solution to this problem anywhere in the literature.

Kadane [8] extends the problem solved in Bellman's book by assuming that there are $N$ gold mines and that the gold-mining machine functions with probability $p_{ij}$ if assigned to the $i$th mine for the $j$th time. If it functions, it processes a (nonrandom) fraction $r_{ij}$ of the amount of gold that remains in mine $i$ after $j-1$ excavations of $i$. If the machine does not function, it cannot be repaired.
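To make the bookkeeping in the Bellman and Kadane models concrete, the following is a minimal sketch that evaluates the expected amount of gold obtained from working the mines in a fixed order. The function name `expected_gold`, the list layout of `z`, `p`, and `r`, and the 0-based indexing are illustrative assumptions, not notation from the paper.

```python
def expected_gold(sequence, z, p, r):
    """Expected gold from working mines in the given fixed order.

    Assumed (hypothetical) data layout:
      z[i]    -- initial amount of gold in mine i
      p[i][j] -- probability the machine functions on excavation j of mine i
      r[i][j] -- fraction of the remaining gold removed on excavation j of mine i
    Excavations are counted from 0 here.
    """
    remaining = list(z)        # gold currently left in each mine
    count = [0] * len(z)       # excavations performed so far in each mine
    alive = 1.0                # probability the machine has survived so far
    total = 0.0
    for i in sequence:
        j = count[i]
        alive *= p[i][j]                       # machine must function today
        total += alive * remaining[i] * r[i][j]
        remaining[i] *= 1.0 - r[i][j]          # gold left if the day succeeded
        count[i] += 1
    return total


if __name__ == "__main__":
    z = [10.0, 8.0]
    p = [[0.5, 0.5], [0.8, 0.8]]
    r = [[0.5, 0.5], [0.25, 0.25]]
    print(expected_gold([0, 1], z, p, r))
    print(expected_gold([1, 0], z, p, r))
```

In this small instance the order of the two mines changes the expected yield, which is why the choice of sequence is a genuine optimization problem.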
Assuming $z_i$ is the amount of gold originally in mine $i$, Kadane solves the problem of how to choose mines sequentially so as to maximize the expected amount of gold, under some regularity conditions. The regularity conditions are imposed to rule out situations in which a poor choice today makes lucrative choices available in the future. They are:

(i) the amount of gold processed does not increase with successive excavations, i.e., $r_{ij} \ge (1 - r_{ij})\, r_{i,j+1}$ for all $j \ge 1$, and $1 \le i \le N$.

(ii) the probability of success does not increase with successive excavations, i.e., $p_{ij} \ge p_{i,j+1}$ for all $j \ge 1$, and $1 \le i \le N$.

This problem solved by Kadane does not provide a solution to the more general Bellman problem, because in Kadane's paper $r_{i1}, r_{i2}, \ldots$ is a fixed sequence of known constants, not random variables.

Kadane shows that the optimal policy in his problem is the one which at each stage selects the mine which has maximum current expected return per "unit risk". That is, if at some stage $n$ each mine $i$ has been excavated $j(i;n)$ times and $1 - p_{i,j(i;n)+1}$ is regarded as the risk for selecting $i$ next, the decisionmaker should choose that $i$ which maximizes

$$\frac{z_i'\, p_{i,j(i;n)+1}\, r_{i,j(i;n)+1}}{1 - p_{i,j(i;n)+1}} \tag{1}$$

where

$$z_i' = z_i \prod_{l=1}^{j(i;n)} [1 - r_{il}]$$

is the current amount of gold in mine $i$, $p_{i,j(i;n)+1}$ is the probability that the machine will function during the next excavation of $i$, and $r_{i,j(i;n)+1}$ is the current rate of return. Thus $z_i'\, p_{i,j(i;n)+1}\, r_{i,j(i;n)+1}$ is the current expected return for the next excavation of $i$. The policy which at each stage $n$ selects that $i$ which maximizes (1) we call locally optimal. When $r_{ij}$, $p_{ij}$ do not depend on $j$, and $N = 2$, this of course yields the solution to Bellman's original problem. If the regularity conditions of Kadane are not met, the locally optimal rule may fail to be optimal; i.e., it may not maximize the overall expected amount of gold to be excavated.

In this paper we extend the Bellman-Kadane problem by allowing $\{r_{ij}\}_{j=1}^{\infty}$ to become a sequence $\{\rho_{ij}\}_{j=1}^{\infty}$
of random variables (not necessarily independent and identically distributed) and by allowing $p_{ij}$ to become a function $\beta_{ij}(\cdot)$ of the past history of excavations in mine $i$. This is a much more useful model for the problem, since in general the fraction of gold to be obtained during the $j$th excavation of $i$ is viewed as random (before the excavation) and its value $r_{ij}$ is only observed after the actual excavation. This is a significant generalization over the Bellman-Kadane model, because as illustrated in section 3, there are many important examples of gold-mining problems in which the rates of return are actually random.

Under some regularity conditions, analogous to those of Kadane, we show that the locally optimal rule is optimal in this extended gold-mining problem. The locally optimal rule in our problem is obtained from expression (1) by replacing $r_{i,j(i;n)+1}$ by the conditional expected value of $\rho_{i,j(i;n)+1}$, given the past observations for mine $i$, and by replacing $p_{i,j(i;n)+1}$ by $\beta_{i,j(i;n)+1}(\cdot)$ evaluated at those observations.

The value $r_{ij}$ of the random variable $\rho_{ij}$ is a measure of the amount of "effort" put into the $j$th excavation of $i$. The more effort put in, the larger $r_{ij}$ should be, "on the average". However, by allowing $p_{ij}$ to become a function $\beta_{ij}(r_{i1}, \ldots, r_{i,j-1})$ of the past observations, we are able to introduce into the model the realistic assumption that the larger the values of $r_{il}$, the smaller $\beta_{ij}(\cdot)$ is; i.e., putting more effort into the excavation may cause the machine to wear out and break down faster.

We would like to point out here a remark related to us by one of the referees. The gold-mining problem was actually a "sanitized" version of what the Rand Corporation and Bellman were really interested in: the optimal time-sequential allocation of bombers to targets.
If there are $N$ targets to be bombed, where target $i$ is worth initially $z_i > 0$ ($z_i$ is a measure of the military value of target $i$), then $\rho_{ij}$ is the fraction by which the value of the $i$th target is reduced during the $j$th bombing run of $i$, and $\beta_{ij}(\rho_{i1}, \ldots, \rho_{i,j-1})$ is the probability that the $j$th bombing run over $i$ will be successfully completed. Realistically, the larger the values of $\rho_{il}$ (i.e., the more "effort" that is put in), the smaller is the probability that the next bomb run will be successful. In this problem also, it is much more realistic to model the problem with random rates of return $\rho_{ij}$. Thus, under the regularity conditions, the optimal rule in this problem will be to choose at each stage that target which has maximal conditional expected "return" per unit "risk".

In the next section we give a more thorough description of the gold-mining problem with random rates of return treated in this paper and state in Theorem 1 the conditions under which the Bellman-Kadane locally optimal policy is optimal for the problem. In section 3 we present a series of examples illustrating the importance and usefulness of our results in modeling this type of operations research problem. Finally, we present both the formal dynamic programming structure of the problem and the proof of Theorem 1 in the appendix.

2. THE TIME-SEQUENTIAL STOCHASTIC GOLD-MINING MODEL

Assume there are $N$ gold mines, each mine $i$ having an initial amount of gold $z_i = z_i^1 > 0$, $1 \le i \le N$. Associated with each mine $i$ is a sequence $\{G_{ij}\}_{j=1}^{\infty}$ of stochastic kernels; i.e., $G_{ij}(\cdot \mid r_1, \ldots, r_{j-1})$ is a transition probability of $[0,1]^{j-1}$ into $[0,1]$. Let $\{\rho_{ij}\}_{j=1}^{\infty}$ be a sequence of random variables whose joint distribution is given by the $G_{ij}$. By this it is meant that given $\rho_{i1} = r_1, \ldots, \rho_{i,j-1} = r_{j-1}$, the conditional distribution of $\rho_{ij}$ is $G_{ij}(\cdot \mid r_1, \ldots, r_{j-1})$. This determines the conditional distribution of the rate of return $\rho_{ij}$ for the $j$th excavation of $i$, given the past observations in $i$.
The $N$ stochastic processes $\{\rho_{1j}\}_{j=1}^{\infty}, \ldots, \{\rho_{Nj}\}_{j=1}^{\infty}$ are mutually independent. Also associated with each mine $i$ is a sequence $\{\beta_{ij}\}_{j=1}^{\infty}$ of functions, where $\beta_{ij}: [0,1]^{j-1} \to [0, \beta]$ for some constant $\beta < 1$, and all $j \ge 1$, $1 \le i \le N$. Each $\beta_{i1}$ is constant. The interpretation is that, given $\rho_{i1} = r_1, \ldots, \rho_{i,j-1} = r_{j-1}$, $\beta_{ij}(r_1, \ldots, r_{j-1})$ is the probability that the machine will not break down on the $j$th excavation of mine $i$.

The structure of our gold-mining problem is the following: If on the first day the miner selects mine $i$, he knows that the machine will break down with probability $1 - \beta_{i1}$ and remain functioning with probability $\beta_{i1}$. If the machine remains in working order, the fraction of gold the miner can extract from the mine is $r$, where $r$ is chosen according to the distribution $G_{i1}$ at the start of the day's work. Thus $r$ is the observed value of $\rho_{i1}$. If the miner has already worked for $n-1$ days and the machine is still in working order, let $j(l)$ be the number of times he has worked mine $l$, so that $0 \le j(l) \le n-1$ for each $l$ and

$$\sum_{l=1}^{N} j(l) = n-1.$$

Let $r_{lj}$ be the value of $\rho_{lj}$ observed on the $j$th excavation of mine $l$, $1 \le j \le j(l)$. If mine $i$ is selected for the $n$th day of work, $\beta_{i,j(i)+1}(\mathbf{r}_{i,j(i)})$ is the probability that the machine will remain in working order during that day, where $\mathbf{r}_{i,j(i)} = (r_{i1}, \ldots, r_{i,j(i)})$. If the machine functions, the rate of return $r$ (i.e., the fraction of gold removed) is chosen from the distribution $G_{i,j(i)+1}(\cdot \mid \mathbf{r}_{i,j(i)})$ ($r$ is the observed value of $\rho_{i,j(i)+1}$) at the beginning of the day's work. If the machine still functions, the miner may make a selection on the $(n+1)$st day, and continue until the machine breaks down or until all the gold is removed from the $N$ mines, which can happen in our formulation of the problem, but not in the Bellman-Kadane models. The decisionmaker's problem is to find a sequential mining procedure to maximize his expected profits.
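The day-by-day structure just described can be sketched as a short simulation loop. This is only an illustration under stated assumptions: the names `simulate`, `policy`, `beta`, and `draw_rate` are hypothetical, the caller supplies the survival function and the conditional sampler, and `max_days` is an artificial cap added so the loop always terminates.

```python
import random


def simulate(policy, z, beta, draw_rate, rng, max_days=10_000):
    """Simulate one run of the mining process and return the gold obtained.

    Hypothetical callables supplied by the caller:
      policy(remaining, history) -> index of the mine to work today
      beta(i, past_rates)        -> probability the machine survives the next
                                    excavation of mine i, given past rates in i
      draw_rate(i, past_rates, rng) -> observed rate of return in [0, 1]
    """
    remaining = list(z)               # gold currently left in each mine
    history = [[] for _ in z]         # observed rates of return, per mine
    total = 0.0
    for _ in range(max_days):
        if all(g <= 0.0 for g in remaining):
            break                     # all gold has been removed
        i = policy(remaining, history)
        if rng.random() >= beta(i, history[i]):
            break                     # machine breaks down; no gold today
        rate = draw_rate(i, history[i], rng)
        total += rate * remaining[i]
        remaining[i] *= 1.0 - rate
        history[i].append(rate)
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    gold = simulate(
        policy=lambda remaining, history: max(range(len(remaining)),
                                              key=remaining.__getitem__),
        z=[10.0, 8.0],
        beta=lambda i, past: 0.9,
        draw_rate=lambda i, past, rng: rng.uniform(0.0, 0.5),
        rng=rng,
    )
    print(gold)
```

Averaging `simulate` over many runs gives a Monte Carlo estimate of the expected reward of a policy, which can be used to compare candidate rules numerically.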
In the Bellman-Kadane models, since the rates of return were known in advance to the miner, the optimal policy was simply a particular fixed sequence $i_1, i_2, \ldots, i_n, \ldots$ of mines to be excavated sequentially, where $i_n$ is the mine to be worked on the $n$th day. In our model the rate of return for the $j$th excavation of mine $i$ is not observed until the actual excavation is performed, so that at each stage $n$ the miner must make his decision based on past observations. Different sets of past observations may lead to different decisions at stage $n$. Hence, the optimal policy will not be a fixed sequence of mines.

As in Kadane [8], we must impose certain regularity conditions to rule out situations in which a poor choice today makes lucrative choices available in the future. The basic regularity condition is

(C) For each $i$,

$$R_{ij}(\boldsymbol{\rho}_{i,j-1}) = \frac{\beta_{ij}(\boldsymbol{\rho}_{i,j-1}) \left( \prod_{l=1}^{j-1} [1 - \rho_{il}] \right) E(\rho_{ij} \mid \boldsymbol{\rho}_{i,j-1})}{1 - \beta_{ij}(\boldsymbol{\rho}_{i,j-1})}$$

is monotone nonincreasing in $j$, almost surely.

In (C), $\boldsymbol{\rho}_{i,j-1} = (\rho_{i1}, \ldots, \rho_{i,j-1})$ is the sequence of past observations in $i$, and $E(\rho_{ij} \mid \boldsymbol{\rho}_{i,j-1})$ is the conditional expectation of $\rho_{ij}$ given the past observations. Thus $z_i R_{ij}(\boldsymbol{\rho}_{i,j-1})$ is expression (1) of section 1 with $j = j(i;n) + 1$ and $r_{ij}$ replaced by $\rho_{ij}$.

Essentially, condition (C) is a form of the law of diminishing rate of returns, an entirely realistic assumption. This states that the expected current amount of return per unit "risk", conditioned on the past, does not increase as the number of excavations in $i$ increases, i.e., the machine in a sense may be wearing out, becoming less and less efficient.

A special case of condition (C) is the following:

(M) (i) For each $i$, and each $j \ge 1$,

$$\left( \prod_{l=1}^{j-1} [1 - \rho_{il}] \right) E(\rho_{ij} \mid \boldsymbol{\rho}_{i,j-1}) \ge \left( \prod_{l=1}^{j} [1 - \rho_{il}] \right) E(\rho_{i,j+1} \mid \boldsymbol{\rho}_{ij}), \quad \text{almost surely.}$$

(ii) For each $i$, and each $j \ge 1$, $\beta_{ij}(\boldsymbol{\rho}_{i,j-1}) \ge \beta_{i,j+1}(\boldsymbol{\rho}_{ij})$, almost surely.

If condition (M) is satisfied, then so is (C).
Moreover, condition (M) for our model is the analogue of the regularity conditions of Kadane, (i) and (ii) of section 1. Thus (M)(i) states that the amount of gold processed does not increase with successive excavations and (M)(ii) states that the probability of success does not increase with successive excavations. Notice that if $\beta_{ij} = \beta_i$ is constant, for all $j$, so that there is constant risk for each $i$, then (C) is met if and only if (M)(i) is satisfied. Also note that if (M)(i) is met, then $\{1 - \rho_{ij}\}_{j=1}^{\infty}$ satisfies the strong monotonicity condition (or SMC) of Hall [6]. If $\{\rho_{ij}\}_{j=1}^{\infty}$ is a sequence of constants satisfying (M), our model yields Kadane's formulation of the problem.

Throughout the rest of the paper, for each stage $n \ge 1$, $j(i;n)$ will denote the number of excavations performed in mine $i$ up to stage $n$. We now formally define the locally optimal rule for our problem.

DEFINITION 1: The locally optimal decision rule is the one which at each stage $n$ selects the mine which maximizes $z_i R_{i,j(i;n)+1}(\boldsymbol{\rho}_{i,j(i;n)})$; i.e., mine $i$ is chosen if

$$z_i R_{i,j(i;n)+1}(\boldsymbol{\rho}_{i,j(i;n)}) = \max_{1 \le l \le N} z_l R_{l,j(l;n)+1}(\boldsymbol{\rho}_{l,j(l;n)}), \tag{2}$$

where $R_{ij}(\boldsymbol{\rho}_{i,j-1})$ is defined in (C). We shall also refer to this rule as the Bellman-Kadane rule.

This rule is not optimal in all possible mining problems. One can easily construct examples to show this. However, we can prove the following theorem.

THEOREM 1: Assume that for each $i$, $\{\rho_{ij}\}_{j=1}^{\infty}$, $\{\beta_{ij}\}_{j=1}^{\infty}$ satisfy condition (C). Then the Bellman-Kadane rule is optimal; i.e., it maximizes the total expected amount of gold to be excavated.

The proof of this theorem is given in the appendix. Essentially, the proof shows that if at some stage $n$ mine $i$ is most attractive in the sense that it achieves equality in (2), and some other mine $k$ is chosen, then if the machine still functions, mine $i$ will be most attractive after $k$ is worked for that day. This may not be true if (C) is not satisfied. An example is provided in the section on applications.
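The locally optimal rule of Definition 1 amounts to computing one index per mine and taking the largest. A minimal sketch follows, assuming the caller can supply the survival probability and the conditional expected rate of return for the next excavation of each mine; the names `index`, `locally_optimal_choice`, `beta`, and `cond_mean` are hypothetical.

```python
def index(z_i, past_rates, beta_next, mean_next):
    """Conditional expected return per unit risk for the next excavation.

    z_i        -- initial gold in the mine
    past_rates -- observed rates of return in this mine so far
    beta_next  -- survival probability for the next excavation (must be < 1)
    mean_next  -- conditional expected rate of return for the next excavation
    """
    remaining = z_i
    for rate in past_rates:
        remaining *= 1.0 - rate           # gold left after past excavations
    return remaining * beta_next * mean_next / (1.0 - beta_next)


def locally_optimal_choice(z, history, beta, cond_mean):
    """Return the mine with the largest index (ties go to the lowest number).

    beta(i, past_rates) and cond_mean(i, past_rates) are caller-supplied
    (hypothetical) functions for the survival probability and the conditional
    expected rate of return of the next excavation of mine i.
    """
    scores = [
        index(z[i], history[i], beta(i, history[i]), cond_mean(i, history[i]))
        for i in range(len(z))
    ]
    return max(range(len(z)), key=scores.__getitem__)


if __name__ == "__main__":
    betas = [0.5, 0.8]
    means = [0.5, 0.25]
    choice = locally_optimal_choice(
        z=[10.0, 8.0],
        history=[[], []],
        beta=lambda i, past: betas[i],
        cond_mean=lambda i, past: means[i],
    )
    print(choice)
```

The index only needs information about the mine it describes, so the rule never has to look ahead or enumerate sequences; whether it is optimal depends on the regularity condition in the theorem.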
However, if condition (C) holds, we see that the "intuitively reasonable" rule of always selecting the action which has maximum conditional expected return per unit risk is optimal, even though the distribution of future rates of return is highly dependent on the past observations.

In the appendix we also show how the gold-mining problem may be formulated as a dynamic programming problem (see Bellman [1], Blackwell [2], and Hinderer [7]). Since this is essentially a discounted dynamic programming problem, an optimal rule always exists. However, unless condition (C) is met, it may be very difficult to explicitly compute an optimal rule.

Since the formulation of the sequential bomber-target allocation problem yields the same dynamic programming problem, it is clear that if (C) is satisfied, the optimal rule is the one which always selects that target which has maximal current conditional expected "return" per unit "risk". Condition (C) is entirely reasonable for this problem. It reflects the fact that conditional expected return per unit risk does not increase with successive bombing runs over a target. Our model is more realistic than that of Bellman or Kadane in this problem, since in our formulation it is possible that a target may be entirely destroyed during an attack (since we could have $\rho_{ij} = 1$ for some $j$), whereas this possibility is not modeled by the Bellman-Kadane formulation, since their rates of return are constant.

3. APPLICATIONS AND EXAMPLES

(a) The Multiple Sites-Per-Target Problem

This example is posed as a tactical-allocation problem. Assume each target $i$ has $n_i > 0$ sites or "subtargets" which must be destroyed in order to destroy target $i$. For instance, a target may be an area with $z_i = n_i$ installations, each of which must be destroyed. We assume that each installation is worth one unit. Let $\theta_i$ be the probability that any one of the installations is destroyed during a particular bomb run. We assume that $\theta_i$ is known, $0 < \theta_i < 1$.
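As a numerical illustration of this setup, suppose (this is an assumption made for the sketch) that on each run every surviving installation is destroyed independently with probability $\theta_i$, so that the number destroyed is binomial. The expected fraction of surviving installations destroyed on a run is then $\theta_i$ no matter how many remain, which can be checked by direct summation. The function name `expected_fraction_destroyed` is hypothetical.

```python
from math import comb


def expected_fraction_destroyed(m, theta):
    """Expected value of (sites destroyed) / m when the number destroyed
    is Binomial(m, theta); m is the number of surviving sites (m >= 1)."""
    return sum(
        (k / m) * comb(m, k) * theta**k * (1.0 - theta) ** (m - k)
        for k in range(m + 1)
    )


if __name__ == "__main__":
    for m in (1, 2, 5, 10):
        print(m, expected_fraction_destroyed(m, 0.3))
```

Because the expected rate of return does not depend on the number of remaining sites, the expected return per run shrinks only through the shrinking stock of surviving installations.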
Therefore, if $j-1$ bomb runs have been made on target $i$, and if $m_{i,j-1} > 0$ sites of target $i$ remain undestroyed ($m_{i,j-1} \le n_i$), then the random number $\delta_{ij}$ of sites that will be destroyed on the next run has distribution

$$P(\delta_{ij} = k \mid m_{i,j-1}) = \binom{m_{i,j-1}}{k} \theta_i^{k} (1 - \theta_i)^{m_{i,j-1} - k}, \quad 0 \le k \le m_{i,j-1},$$

i.e., it is binomial. Note that

$$m_{i,j-1} = n_i - \sum_{l=1}^{j-1} \delta_{il},$$

where $\delta_{il}$ is the number of sites destroyed on the $l$th run on target $i$, and that the rate of return is $\rho_{ij} = \delta_{ij} / m_{i,j-1}$. Since $E(\delta_{ij} \mid m_{i,j-1}) = m_{i,j-1}\theta_i = E(\delta_{ij} \mid \boldsymbol{\delta}_{i,j-1})$, it is clear that

$$E(\rho_{ij} \mid \boldsymbol{\rho}_{i,j-1}) \ge [1 - \rho_{ij}]\, E(\rho_{i,j+1} \mid \boldsymbol{\rho}_{ij}), \quad \text{a.s.,}$$

since $E(\rho_{ij} \mid \boldsymbol{\rho}_{i,j-1}) = \theta_i$. Thus (M)(i) of section 2 is met, and if (M)(ii) is met, then the optimal policy is the one which at each stage selects the target with maximum conditional expected return per unit risk. Although this is certainly the "intuitively obvious" solution for this particular problem, this problem does not fall under the formulation of either Bellman or Kadane, since the rates of return are random and are not identically distributed. Hence we have provided a rigorous proof (Theorem 1 of section 2) of the "obvious" fact that this rule is optimal.

(b) The i.i.d. case

Suppose that for each mine $i$, $\rho_{i1}, \rho_{i2}, \ldots$ are independent and identically distributed with known distribution. Then condition (M)(i) of section 2 is clearly satisfied. If, moreover, $\{\beta_{ij}\}$ satisfies (M)(ii) then condition (M) is met, hence so is (C). Thus, our theorem provides a solution to the extension of the gold-mining problem mentioned in Bellman [1].

(c) The parametric adaptive gold-mining problem

In this subsection we give two examples of the parametric adaptive problem. Generally, assume that, given $\theta_i$, $\rho_{i1}, \rho_{i2}, \ldots$ are i.i.d. with distribution $G_i(dx \mid \theta_i)$, where $\theta_i$ is a fixed parameter. Now it may be that the actual value of the parameter is unknown, but that the decisionmaker has enough prior information about the parameter to develop a prior distribution $F_i$ for $\theta_i$. Specifically, we assume that $\theta_1$, .
. ., $\theta_N$ were chosen independently from known prior distributions $F_1, \ldots, F_N$, respectively. This type of OR problem is a parametric adaptive problem. If $V_{\theta,\pi}(z)$ denotes the expected reward using policy $\pi$ starting at initial state $z = (z_1, \ldots, z_N)$ when $\theta = (\theta_1, \ldots, \theta_N)$ is the state of nature, $E V_{\theta,\pi}(z)$ is the expected reward from $\pi$ when $\theta$ has distribution

$$\prod_{i=1}^{N} F_i.$$

(See the appendix for an explanation of the notation.) Since $\{\rho_{ij}\}_{j=1}^{\infty}$ are i.i.d. given $\theta_i$, by computing the conditional posterior distribution of $\theta_i$ given $\boldsymbol{\rho}_{i,m-1} = (\rho_{i1}, \ldots, \rho_{i,m-1})$, we may find $E(\rho_{im} \mid \boldsymbol{\rho}_{i,m-1})$ since it equals

$$E_{\theta_i \mid \boldsymbol{\rho}_{i,m-1}}\bigl( E[\rho_{im} \mid \theta_i] \bigm| \boldsymbol{\rho}_{i,m-1} \bigr),$$

where $E_{\theta_i \mid \boldsymbol{\rho}_{i,m-1}}$ is the expectation with respect to the distribution of $\theta_i$ given $\boldsymbol{\rho}_{i,m-1}$. Thus, if the marginal joint distribution of $\rho_{i1}, \rho_{i2}, \ldots$ satisfies (M)(i), we may apply Theorem 1 to the parametric adaptive problem. We give two examples for this problem wherein the posterior distribution of each $\theta_i$ can be readily computed and (M)(i) is satisfied.

The first example is phrased in terms of an optimal target-allocation problem. Assume that for each target $i$, the nature of the target is such that if the target is hit directly, it is totally destroyed. If the target is missed on a bomb run, little or no damage is done to the target (an assumption made, for example, in Dantzig et al. [4]). This would be the case if, say, the targets were underground installations which were well-protected and which only would cease to be functional in case of a direct hit causing total damage. Let $\theta_i$ be the (unknown) probability of a direct hit. Thus, given $\theta_i$, $\rho_{i1}, \rho_{i2}, \ldots$ are i.i.d. Bernoulli random variables with probability mass function

$$f(\rho \mid \theta_i) = \theta_i^{\rho} (1 - \theta_i)^{1 - \rho}, \quad \rho = 0, 1.$$

Hence $\rho = 1$ if and only if the target is destroyed. Dropping the subscript $i$, we take as initial prior for $\theta$ a beta distribution with density

$$g(\theta) = \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\, \theta^{a-1} (1 - \theta)^{b-1}, \quad 0 \le \theta \le 1,$$

written $\theta \in Be(a, b)$.
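For a $Be(a, b)$ prior with Bernoulli observations, the standard conjugate update gives a $Be(a + s, b + n - s)$ posterior after $n$ observations with $s$ successes, so the conditional expected rate of return for the next run is $(a + s)/(a + b + n)$. The sketch below uses that fact to check inequality (M)(i) numerically over short histories. The function names are hypothetical, and $a$, $b$ are taken to be positive integers here only so that exact rational arithmetic can be used.

```python
from fractions import Fraction
from itertools import product


def posterior_mean(a, b, observations):
    """Posterior mean of theta under a Be(a, b) prior after 0/1 observations.

    a and b are assumed to be positive integers in this sketch."""
    s = sum(observations)
    n = len(observations)
    return Fraction(a + s, a + b + n)


def satisfies_m_i(a, b, observations):
    """Check E(rho_m | past) >= (1 - rho_m) * E(rho_{m+1} | past, rho_m)
    for both possible values rho_m = 0 and rho_m = 1."""
    now = posterior_mean(a, b, observations)
    return all(
        now >= (1 - nxt) * posterior_mean(a, b, list(observations) + [nxt])
        for nxt in (0, 1)
    )


if __name__ == "__main__":
    ok = all(
        satisfies_m_i(2, 3, list(hist))
        for n in range(5)
        for hist in product((0, 1), repeat=n)
    )
    print(ok)
```

The check succeeds for a simple reason: a direct hit leaves nothing to gain on the next run, and a miss can only lower the posterior mean of $\theta$.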
We choose this prior for ease of computation and because the beta distributions with various choices for $a$, $b$ may be used to approximate many different distributions on $[0, 1]$ ([3], pg. 463). Since

$$g(\theta \mid \rho_1, \ldots, \rho_{m-1}) \propto \prod_{j=1}^{m-1} f(\rho_j \mid \theta)\, g(\theta),$$

it is easy to see that the posterior distribution of $\theta$ is

$$Be\left( a + \sum_{j=1}^{m-1} \rho_j,\; b + m - 1 - \sum_{j=1}^{m-1} \rho_j \right).$$

Thus

$$E(\rho_m \mid \boldsymbol{\rho}_{m-1}) = \frac{a + \sum_{j=1}^{m-1} \rho_j}{a + b + m - 1}.$$

It follows that

$$E(\rho_m \mid \boldsymbol{\rho}_{m-1}) \ge [1 - \rho_m]\, E(\rho_{m+1} \mid \boldsymbol{\rho}_m)$$

for $\rho_m = 0$ or $1$. Hence, if $\{\beta_{ij}\}$ satisfies (M)(ii) (say $\beta_{ij} = \beta_i$, independent of $j$) then (C) holds, and Theorem 1 applies.

As a second example, let $\gamma = 1 - \rho$, and assume that given $\theta > 0$, $\gamma_1, \gamma_2, \ldots$ are i.i.d. with density

$$f(\gamma \mid \theta) = \theta \gamma^{\theta - 1} I_{[0,1]}(\gamma),$$

which is $Be(\theta, 1)$. Hence $\rho \in Be(1, \theta)$. The beta distribution is chosen for $\rho$ because it is commonly used as an approximating distribution for probability distributions on $[0, 1]$ and because we merely want to illustrate the computations involved in the theorem. As prior, let

$$g(\theta) = \frac{b^{a}}{\Gamma(a)}\, \theta^{a-1} e^{-b\theta}, \quad \theta > 0,$$

where $a > 0$, $b > 0$. This is denoted as $\theta \in G(a, b)$. Hence

$$g(\theta \mid \gamma_1, \ldots, \gamma_n) \propto g(\theta) \prod_{i=1}^{n} f(\gamma_i \mid \theta) \propto \theta^{a+n-1} \exp\left\{ -\theta \left( b + \sum_{i=1}^{n} \log(\gamma_i^{-1}) \right) \right\};$$

i.e., the posterior distribution of $\theta$ is

$$G\left( a + n,\; b + \sum_{i=1}^{n} \log(\gamma_i^{-1}) \right).$$

As in Example (b) of section 5 of Hall [6], it is not difficult to check that $\{\gamma_j\}_{j=1}^{\infty}$ satisfies (M)(i) if $b > 1$. It would be reasonable in some applications to take

$$\beta_j(\rho_1, \ldots, \rho_{j-1}) = \prod_{l=1}^{j-1} \beta(\rho_l),$$

where $\beta(\cdot)$ is some function such that $0 \le \beta \le \beta^* < 1$ and $\beta^*$ is constant.

4. CONCLUDING REMARKS

This paper has formulated and solved a class of stochastic gold-mining problems wherein the rates of return are random variables and the probability of breakdown of the mining machine depends on previous observations of the return rates. We have shown that if the model satisfies the law of diminishing rate of returns, the structure of the optimal policy is similar to that obtained by Bellman and Kadane: at each stage, select a mine which has maximal (conditional) expected return per unit risk.
The solution was illustrated with several examples concerning the optimal time- sequential allocation of bombers to targets and a more sophisticated mathematical example. The author anticipates that even more useful examples may be found in the unpublished literature on the problem. APPENDIX In this section we shall formally construct the dynamic programming formulation for the problem and give the optimality equation for the reward function. Using this, we will show that under our regularity conditions the locally optimal rule is optimal. Those readers not interested in the formal development may skim this portion and go on to the optimality equation, (-45) below. To formally develop the dynamic programming model, we must define the action space A, state space S, transition probability q{-\-, ■):SXA-->S, and reward function r(-, •) : SXA-^R (the TIME-SEQUENTIAL TACTICAL-ALLOCATION 89 real line^ (see Blackwell [2] and Strauch [9]). Let A= {1,2,..., N} so that taking action i at stage n means excavating mine i at that stage. Let A^= \z={zi, . . ., z^) : 0<Zi<z\, l<i<N} and put S'i = A^, the set of initial states. For each ieA and ^e[0, 1], define the function T(i, t)\ A^^A^ by T{i, t) z=z' where 2,' = {\.-t)Zi and z'j=Zj ior J9^i. If z is the state vector for the amount of gold at some stage (so that each mine I contains amount Zj >0) and mine i is worked at rate t, the new vector after that day's work is T{i, t)z (assuming the machine remains undamaged). For each n>2, let <S„={s„=(2S iu z\ i2, . . ., in-uf)- ^rn, l<m<n-l, i^tA, s-^+^eA^ and 3u„€[0, l]^z^+'=Tii,n, ujzf}. Thus Sn is the set of all possible histories for the process up to time n, since i„ is the action taken at time m and 2" is the vector of amounts of gold in the mines at time m. For notational convenience, if im=l, let ti_,^(i)=Um, where VmiO^^mil', h, •■ • ■, im-\) is the number of times that I occurs in ii, . . ., im, \<m<n-\. 
Thus ti,y^{i) is the value of the rate of return random variable for the j'm(Oth excavation of I, performed at time m. Let d be an isolated point which represents the state of the process when either the machine breaks down or all the gold is extracted; i.e., it is the state of the system when the process finally terminates. Then the state space -5=0 Sr,\}{d]. n=l We take S as the (topological) disjoint union of the Sn& together with the Borel sets generated by the topology. To define the transition probability g which determines the evolution of the process, let s„ e Sn, for some n>\, and let it A. Then tn, . . ., ij.^ci) denotes the observed values of pn, . . ., Pi, m(.D, respectively, which occur in s „, where m (i) = m (i ; s „) is the number of times mine i is excavated in s„, l<i< AT'. For notation, write Gi_rrm)+i{-\Sn)=Gi_m{i)^i{-\tn, . . ., ii.m«)) (where Gn{-\s„)=6ii) &nd 0i,mii)+iiSn)=Pi.ma)+iiin, • • •. <i.m(i)). Then pntq{d\s„, i) = l—0t.m(t)+dsn)- For B a Borel sub- set of A^, define (Al) 2({(S„, i)}XB\Sn, ^)^0^.mU)+l{Sn)Gi.mU)+^{{U[O, l]:^(^, t)ftB}\Sn). Thus g((S'„+i|s„, i) = l — q{d\sn, i). Lastly, set q{d\d, i) = l, so that once the system enters state d, it remains in d. Finally, to define the reward function r:SXA—*R, let s„eS„ and it A. Set (A2) riSn,i) = 2"^i.mU)+l{Sn)j tG i,rr>U) + lidt\s„) where (A3) 3<"=2<' n (l-<<^), l<i<N. Define r{d, i) =0, so that there is no reward once the system enters the "terminal" state d. In the above notation, Sn is the set of all histories of the system up to time n, and qid\s„, i) is the probability of breakdown of the machine after observing history s„ and then taking action i. If the machine does not break down, the new history is s„+i=(s„, i, T(i, i)z^)eSn+\, whose distri- bution is determined by (Al). The reward function specifies the average (expected) amount of reward we could receive if after observing history s„ we take action i. 
This is (A2), and (A3) is the equation for the amount of gold in mine i after m{i) excavations there. We have thus specified 90 G. J. HALL, JR. A, S, q, and r. At each stage n, we must make a decision based on the entire history s„. The decision is a function of s„, so we have a map (t„:S-^A such that at time n, (r„(s„) is our decision. A policy or decision rule is a sequence <r= {cr„}r of (measurable) maps. From Blackwell [2] and Hinderer [7], we know that there is an optimal policy of the form /^°°'= (/, /, ■ . •), wheveJ-.S-^A, since in the above formulation we have shown that ours is essentially a discounted dynamic programming problem. Let Vc denote the expected reward under policy a and let y=sup,F,r denote the maximal ex- pected reward from policy a, given history s„. Then (see [2], [7]) V satisfies the following optimality equation (or O.E.) : (A4) y(s„)= max 2<"/3,.„(<)+i(s„) tGi,,n(o+i{dt\s„) l<i<N [ Jo Notice V{d)=Q. If we define (L<F)(c?) = and {Ly){s„) to be the ith term on the right-hand side of (A4), then V{s)= max (i,F)(s) for all stS. l<i<N We will now prove Theorem 1 of Section 2, which is Theorem 3 of the appendix. We will show that at each stage n the locally optimal policy selects an action which achieves equality in the O.E. (A4). We will prove the theorem under assumptions (M) (i), (ii), since this is conceptually somewhat easier to follow than under assumption (C). The proof of the theorem under assumption (C) is almost identical. We begin by proving the following theorem. We drop the superscript from z^ and assume 2i>0, l<i<N. THEOREM 1: (a) In the sequential stochastic gold-mining problem, suppose that z is such that for all i, 2<i<N, and all m>\, ^ 13 m ^ ^i^Umiptl, ■ • •, Pi.m-l){ n [1 — Pij] )£'(pi,m|p.l, • • •, P(.m-l) (A5) ' > r ' a.s. 1—^11 i—0i.m{Pu, • ■ ., Pi.m-l) Then action 1 achieves equality in the O.E., i.e., V(z) = (LiV)(z)>(LiV)(2),for 2<i<N. 
(b) If z is such that (A6) Mi4(Pii)>ii^il^Pii)for2<i<A^ 1 Pn 1 Pn and (A5) holds, then action 1 is the only action achieving equality in the O.E., i.e., V{z) = (LiV) (2)> (ZiF)(£), for 2 <i<A^. PROOF: We show that for any policy (t={(t„}T such that ai{z)=h9^\, there is a policy a' = {(T'n}T such that 0-1 (2) = 1 and (A7) K-(2)-K(£)>(l-^,,)2i^n^(pn)-(l-^ii)2.|8.i^(pM). Since there is an optimal policy /">, (a) and (6) follow immediately. Notice that condition (A5) implies that if action 1 is the most attractive decision at state 2 and some other action is taken, then action 1 remains the most attractive. The proof of the theorem will depend on finding the conditional expected reward of a policy <x given a sample path {<i^}r=i of the random variables {p 0)7=1, l<i<N. Let {<o}r=i be a sample TIME-SEQUENTIAL TACTICAL-ALLOCATION 91 path of {pij}T-i, for each i, and let ii, ii, . . ., t^.i, i„, im+\, ... be the sequence of actions resulting from the use of policy a for this fixed path. Assume that im=l, but t*?^! for 1 <A:<m -1 (recall i^=hr^l), so that m is the first stage at which policy a selects action 1. For this fixed path let a' be the policy whose sequence of actions is 1, ii, 12, . . ., v_i, i^+j, i,n+2, ■ ■ ■• Notice that this determines a well-defined policy (T' = {<T^}n=i where o-^: S—*A, n>l. Let 2<k<m, and consider the difference of the conditional expected reward under the two sequences of actions ii, 12, . . ., ik-2, 1, it-i, 4+i, . . • and ii, iz, . . ., ik-2, t*-i, 1, 4+i, • • -, given the samplepath.Fornotation,let/3<^^(0 denote /3ij|., J |(<^j)(<iji,i, . . ., <<^ j,;,^ j(<^ p), andlet jtGi^Jdtll denote J^^'»-i-Vi('»-,>(<^^l*<»-rl' • • •' *'»-.. ».-i «»-,>)• Since the reward is the same under both sequences of actions at each stage except for stages k-l and k, the conditional expected diflFerence, given that the machine does not break down before stage k-l, is (A8) where Si", the amount of gold in mine i at stage n, is as in (A3). 
Thus (A8) becomes (A9) Zi0ujtGum{l-0uJf)}-2i,.,Q''nUi-ti,,,j]^0i,Jt)^ The conditional probability that the machine does not break down before stage k-l is (Alo) n U ^tj(tn,...,tu-i)=0''-Ht). 1=1 j=l Thus the conditional expected difference given the sample path is (AH) 2i0'-Ht)0n J tGn(dt){l-^,^Jl)}-Zi^_^(^''n'\l^^ The total conditional difference of incomes under o-' and a is found as the sum of differences (All) by moving action 1 each step one place further toward stage 1. Thus we obtain for the conditional expected difference m /• (A12) Z; (2i/3*-H0^u tGn{dt){l-fii^^it)} k=2 ~ J " 92 G. J. HALL, JR. Let Jk be the random variable obtained from the ^th term of (Al2) by substituting pij for tij and by substituting the action which a prescribes at time k for i*. Let f be the first time a chooses action L Although it is clear that we only need consider those policies a for which 2< f < oo a.s., we will allow the case where PaU= °°)>0, and note that for any sample path where m= «, (A12) gives the corresponding infinite series. Thus, if Ia is the indicator function of event A, and if J„=0, then / =°° m \ / °° = °° \ / °° \ (A13) K- (z)-V.{z)=E, ( T. /if="'i 1:Jh)=EAJ:J,^ /,r=™) )=K ( S JJu>k) ) ~ ~ \m=2 k=2 / \k=2 m=k / \t=2 / Here we have used the independence of the A'^ stochastic processes and (A5) to conclude that Ea{JkI\{>k])'>0 for all k, since I\{>k] only depends on pij for l<j<vic-2{i), l<i<N. Thus inequality (A7) is established. Since there is an optimal policy /'°°\ it follows that if F(2) = F/(»)(2) = Z,0niE{p,,)+0,, jVfi'^)iz,h, T{h, t)z)Gnx{dt) = Zn0niEip,r)+finijV{z,h, T{h, t) zJG,x{dt) = iL,V) (z) <V/-)'(£) = 2i/3n^(pn)+|8u JV/<«)'(2, 1, T(l, t)z)Gn{dt) <2x0nE{pu)+0njV{z^, 1, Til, t)z)Gn{dt) = {LiV)iz)<Viz). This establishes (a), and (b) is similar. Q.E.D. A sequence {Xj}]'=i of [0, l]-valued random variables satisfies the weak ntonotonicity condition if for each m>l, (A14) Eil-X,)>(^U Z,) E{l-X^^r\Xu . . ., XJ, a.s. We have the following theorem. 
THEOREM 2: Suppose that for each i, l<i<N, the random variables {pi^jf-i are such that {1 — p,^}"=i satisfies the weak monotonicity condition (Al4) and that (M)(ii) is satisfied. Let action io be such that ^ioKiE{pi^.i) 2t^nE(pa) where s is the initial state. Then action i„ achieves equality in the O.E., i.e., V{z) = {Liy)(z). PROOF: This follows immediately from Theorem 1. Q.E.D. The locally optimal policy may be formally defined as follows: DEFINITION: The Bellman-Kadane analogue in the sequential stochastic gold-mining problem is the following policy ^<°> = (gr, g,. . .): For Sn(S„, ^(s„) is any action i such that TIME-SEQUENTIAL TACTICAL-ALLOCATION 93 foraU^, l<h<N. For s=d,j{d) is arbitrary. THEOREM 3. Suppose that for each i, \<i<N, the random variables {1 — p^lT-i satisfy (M)(i) and (M)(ii) of section 2. Then the Bellman-Kadane analogue y'"' is optimal, i.e., F(s„) = yg(oc,)(s„) for all s„. PROOF. According to the optimality criterion of dynamic programming (see Hinderer [4] or Strauch [7]), it is enough to show that for almost all Snf^S, V {s ,^ = {Lg(,„)V) {s n) ■ We have already shown this to be true for n=\. Notice that for each m>\, [l — Ptj]J=m+\ satisfies the strong monotonicity condition on the event f m I n (1-Piy)>0 Thus, repetition of the proof of Theorem 1 at each stage n establishes the optimality criterion. Q.E.D. Remark. It is easy to see from the above that condition (C) of section 2 yields the same conclusion. BIBLIOGRAPHY [1] Bellman, Richard E., Dynamic Programming (Princeton University Press, Princeton 1957). [2] Blackwell, David, "Discounted Dynamic Programming," Annals of Mathematical Statistics, 36, 226-235 (1965). [3] Cox, D. R., and D. V. Hinkley, Theoretical Statistics (John Wiley and Sons, Inc., New York 1974). [4] Dantzig, G. B., et. al., "Targeteer," Rand Research Memorandum RM-2622. [5] Ferguson, Thomas S., "A Bayesian Analysis of Some Non-Parametric Problems," Annals of Statistics 1, 209-230 (1973). 
[6] Hall, Gaineford J., Jr., "Sequential Search with Random Overlook Probabilities," Annals of Statistics 4, 807-816.
[7] Hinderer, K., Foundations of Non-Stationary Dynamic Programming with Discrete Time Parameter (Springer-Verlag, New York 1970).
[8] Kadane, Joseph B., "Quiz show Problems," Journal of Mathematical Analysis and Applications 26, 609-623 (1969).
[9] Strauch, Ralph E., "Negative dynamic programming," Annals of Mathematical Statistics 37, 871-889 (1966).

THE SEARCH FOR AN INTELLIGENT EVADER: STRATEGIES FOR SEARCHER AND EVADER IN THE TWO-REGION PROBLEM

D. M. Roberts
Ministry of Defence
London, England

J. C. Gittins
University Mathematical Institute
Oxford, England

ABSTRACT

This paper considers the search for an evader concealed in one of two regions, each of which is characterized by its detection probability. The single-sided problem, in which the searcher is told the probability of the evader being located in a particular region, has been examined previously. We shall be concerned with the double-sided problem in which the evader chooses this probability secretly, although he may not subsequently move: his optimal strategy consists of that probability distribution which maximizes the expected time to detection, while the searcher's optimal strategy is the sequence of searches which limits the evader to this expected time. It transpires for this problem that optimal strategies for both searcher and evader may generally be obtained to a surprisingly good degree of approximation by using the optimal strategies for the closely related (but far more easily solved) problem in which the evader is completely free to move between searches.

1. INTRODUCTION

Suppose a stationary object is hidden in one of $N$ distinct regions. The probability of its being concealed in region $i$ ($i = 1, 2, \ldots, N$) will be denoted $p_i$, and the location probability vector by $P = (p_1, p_2, \ldots, p_N)$.
Each region is characterized by its detection probability $q_i$, which is the probability that a search of region $i$ will discover the object if it is there; to avoid unnecessary complications, suppose $0 < q_i < 1$. We assume that the time taken to search any region is constant, and take this constant to be the unit of time. From Bayes' theorem it follows that an unsuccessful search of region $j$ changes the location probability vector as shown below.

\[ p_j \to \frac{p_j(1-q_j)}{1-p_jq_j}, \qquad p_i \to \frac{p_i}{1-p_jq_j} \quad \text{for all } i \neq j. \tag{1} \]

It has been shown by, among others, Black [1] that the strategy which at any time searches the region with the greatest current value of $p_iq_i$ minimizes the expected search time. We shall refer to the problem just described as the single-sided problem. This is in contrast to the double-sided problem in which we remove the implicit assumption that the initial value of $P$ is known to the searcher, and instead allow it to be secretly chosen by the object itself, which we now call an evader.

The double-sided problem is a zero sum game between the searcher and the evader. The searcher's pure strategies consist of infinite sequences of integers between 1 and $N$, representing the sequence in which the regions are searched. Let $A$ denote a typical sequence. The evader's pure strategies are the $N$ regions in which he may hide, and once hidden he may not subsequently move. The initial vector $P$ is a mixed strategy for the evader. The payoff is the time to detection, which the searcher seeks to minimize and the evader to maximize.

It is this double-sided problem which we shall discuss in this paper. We shall also be interested in the closely related problem where the evader is able to move freely between searches. It is not difficult to show (see Norris [3]) for this problem that the maximin strategy for the evader consists of hiding initially, and after each successive search, using the location probability vector $P_0$, defined such that $p_iq_i$ is the same for all $i$.
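As an illustration of transformation (1), of Black's rule, and of the vector for which $p_iq_i$ is the same in every region, the following Python sketch may help. The function names are introduced here for illustration only and do not come from the paper; regions are indexed from 0, and the numerical values are arbitrary.

```python
from typing import List


def update_after_miss(p: List[float], q: List[float], j: int) -> List[float]:
    """Apply transformation (1): posterior after an unsuccessful search of region j."""
    miss = 1.0 - p[j] * q[j]            # probability that the search of region j fails
    post = [p_i / miss for p_i in p]    # regions i != j are only renormalised
    post[j] = p[j] * (1.0 - q[j]) / miss
    return post


def black_rule(p: List[float], q: List[float]) -> int:
    """Region with the greatest current p_i * q_i (ties go to the lowest index)."""
    return max(range(len(p)), key=lambda i: p[i] * q[i])


def equalising_vector(q: List[float]) -> List[float]:
    """Location vector for which p_i * q_i is the same in every region."""
    weights = [1.0 / q_i for q_i in q]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    q = [0.2, 0.5]
    p = [0.5, 0.5]
    j = black_rule(p, q)                 # region searched next under Black's rule
    print(j, update_after_miss(p, q, j))
    print(equalising_vector(q))
```

Because $p_iq_i$ is constant exactly when $p_i$ is proportional to $1/q_i$, the equalising vector needs no iteration.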
The determination of the evader's maximin strategy $P^*$ for the first mentioned double-sided problem typically involves extensive computation. It is thus fortunate that, as we shall see, for many two-region problems $P^* = P_0$, and even when this identity does not hold, $P_0$ is still a very good approximation to $P^*$. This observation also enables us to obtain a good approximation to the searcher's minimax strategy.

The double-sided problem with a stationary evader is an appropriate model for a number of realistic search situations. It may be used to explore potential searching strategies against those involved in covert acts of terrorism, or to estimate where in a foreign country a concealed agent should be located, or where at sea a strategic submarine should be deployed. The approximation of $P^*$ by $P_0$ which we examine in detail for the two-region case appears likely to remain a good one when there are more than two regions, and in this sense the problem may be said, for most practical purposes, to be solved.

In Section 2 the main features of the two-region problem, first discussed by Norris [3], are described. This is followed in Section 3 by an investigation of the relationship between $P_0$ and $P^*$. In many two-region examples they coincide; it will be shown that in general, this is more likely to be so when both detection probabilities are large. When both detection probabilities are very small, the proportional difference between payoffs at $P^*$ and $P_0$ is also small, and by examining the continuous time analogue of the discrete problem we show that this feature is not unexpected. Finally, conclusions drawn from a large number of computed examples are cited to give some indication of the differences that one might expect between payoffs at $P_0$ and $P^*$, and these are then used to derive approximately optimal strategies for both searcher and evader.

2.
THE TWO REGION PROBLEM

The vector $P$ which denotes a strategy for the evader now has just two components and may be expressed in the form $P = (p, 1-p)$. Let $V(A, P)$ denote the expected duration of the search (i.e. the expected payoff) when the searcher uses a sequence $A$ and the evader plays $P$. If for sequence $A$ we let $a_i$ denote the expected duration of the search if the evader is actually in region $i$ ($i = 1, 2$), then

\[ V(A, P) = a_1 p + a_2 (1-p), \]

and we shall sometimes find it convenient to represent sequence $A$ by the vector $A = (a_1, a_2)$. Let

\[ V(P) = \min_A V(A, P); \]

clearly $V(P)$ is both a continuous and a concave function of $P$.

The optimal counter to any strategy $P$ must be such that the sequence used after an unsuccessful search is optimal with respect to the transformed location probability vector existing at that stage. Writing $r_i = 1 - q_i$ (i.e. the escape probability associated with region $i$) and using the transformation (1) leads to the functional equation

\[ V(P) = \min\bigl\{\, 1 + \bigl(1 - p(1-r_1)\bigr)V(P'),\;\; 1 + \bigl(1 - (1-p)(1-r_2)\bigr)V(P'') \,\bigr\}, \tag{2} \]

where $P$ transforms into $P'$ or $P''$ according to whether the first (unsuccessful) search is in region 1 or region 2.

In general, (2) is not an easy expression with which to deal. However, it follows from the transformation (1) that if

\[ r_1^{n_1} = r_2^{n_2}, \]

where $n_1$ and $n_2$ are both integers, then $P$ transforms back into itself after $n_1$ searches of region 1 and $n_2$ of region 2. This means that the searcher's optimal strategy is cyclic: from (1) therefore (as pointed out in Norris [3], chapter 2) $V(P)$ can be expressed in closed form and calculated relatively easily. The condition is equivalent to requiring the ratio of the logarithms of the escape probabilities to be rational. If this ratio is irrational there still exist rational numbers which are arbitrarily close to it, so the rational case is of more interest than one might at first suppose.

In any discussion of the two-region problem, three location probability vectors are of great importance.
The first of these is $P_0$, defined, as in the introduction, as $P_0 = \{P : p_1q_1 = p_2q_2\}$. The other two are $P_1$ and $P_2$, which are related to $P_0$ thus:

\[ P_0 \xrightarrow{\;1\;} P_1, \qquad P_0 \xrightarrow{\;2\;} P_2, \]

that is, $P_0$ transforms into $P_1$ after an unsuccessful search of region 1 and into $P_2$ after an unsuccessful search of region 2. $P_1$ and $P_2$ may be regarded as points in $\mathbb{R}^2$ on the line consisting of points of the form $(p, 1-p)$. The interval $P_1P_2$ on this line is important, for it is not difficult to see that if the evader chooses any vector $P$ belonging to $P_1P_2$ as his initial strategy (or if such a vector arises during an optimal sequence), then an optimal search sequence will not transform $P$ outside of this interval, and for this reason $P_1P_2$ is termed the recurrent interval. An example showing the form of the recurrent interval for the problem $r_1 = 0.8$, $n_1 = 3$; $r_2 = 0.71554$, $n_2 = 2$, is drawn in Figure 1.

[Figure 1. Minimum expected duration of search as a function of the target strategy for the case $r_1 = 0.8$, $r_2 = 0.7155$, $n_1 = 3$, $n_2 = 2$.]

3. $P_0$ AS AN APPROXIMATION TO $P^*$

Should the evader ever be concealed with probability vector $P_0$, then it follows from the definition of $P_0$ that the searcher has at his disposal two equally optimal strategies. Moreover, $P_0$ is a simply determined function of $q_1$ and $q_2$, and knowing that sometimes $P^*$, the evader's optimal strategy, is the same as $P_0$ (as in Figure 1), it is natural that one should wish to ascertain precisely when $P_0$ and $P^*$ coincide and, more generally, how their relationship depends on the detection probabilities. Before examining this relationship we shall give two theorems.

THEOREM 1: $P^* = P_0$ when $q_1 = q_2$.

This follows directly from considerations of symmetry.

THEOREM 2: When $r_1^{n_1} = r_2^{n_2}$ and $n_1$ and $n_2$ are integers such that $n_1 = n_2 + 1$, then $P^* = P_0$ if $1 \le n_2 \le 12$, and $P^* \neq P_0$ if $n_2 > 12$.

We first prove this theorem for $1 \le n_2 \le 6$. In the proof, reference will be made to the typical example shown in Figure 1.
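The Figure 1 example can be checked numerically. The sketch below is only an illustration: the helper `search` and the variable names are assumptions introduced here, and `p` denotes the probability that the evader is in region 1. It computes the point where $p q_1 = (1-p) q_2$, the two ends of the recurrent interval reached from it by one unsuccessful search, and confirms that three searches of region 1 and two of region 2 bring the location vector back to its starting value.

```python
def search(p: float, r1: float, r2: float, region: int) -> float:
    """Probability of region 1 after an unsuccessful search of `region` (1 or 2)."""
    if region == 1:
        return p * r1 / (1.0 - p * (1.0 - r1))
    return p / (1.0 - (1.0 - p) * (1.0 - r2))


n1, n2 = 3, 2
r1 = 0.8
r2 = r1 ** (n1 / n2)                          # about 0.71554, so r1**n1 == r2**n2
p0 = (1.0 - r2) / ((1.0 - r1) + (1.0 - r2))   # solves p*q1 == (1-p)*q2
p1 = search(p0, r1, r2, 1)                    # left end of the recurrent interval
p2 = search(p0, r1, r2, 2)                    # right end of the recurrent interval

# Follow the cycle 1, 2, 1, 2, 1 (n1 searches of region 1, n2 of region 2).
path = [p0]
for region in (1, 2, 1, 2, 1):
    path.append(search(path[-1], r1, r2, region))

if __name__ == "__main__":
    print(round(p1, 4), round(p0, 4), round(p2, 4))
    print([round(x, 4) for x in path])
```

In log-odds terms a search of region 1 subtracts $|\log r_1|$ and a search of region 2 adds $|\log r_2|$, which is why the cycle closes exactly when $r_1^{n_1} = r_2^{n_2}$.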
The vector $P_0$ is marked and we shall be concerned with the slopes of the function $V(P)$ on either side of it. If $P_0$ can be shown to be a local maximum under the stated conditions, this will be sufficient. Two preliminary lemmas will be required.

LEMMA 1: The function

\[ 1 + x^{n(n+1)} - 2x^n\,\frac{(1-x)\bigl(1-x^{n(n+1)}\bigr)}{(1-x^n)(1-x^{n+1})} > 0, \]

where $n$ is a positive integer, and $0 < x < 1$.

PROOF: First consider

\[
\begin{aligned}
\frac{(1-x)\bigl(1-x^{n(n+1)}\bigr)}{(1-x^n)(1-x^{n+1})}
&= \frac{1}{1-x^n}\Bigl(\sum_{m=0}^{n-1} x^{m(n+1)} - \sum_{m=0}^{n-1} x^{m(n+1)+1}\Bigr) \\
&= \frac{1}{1-x^n}\Bigl(1 - x^{n^2} + \sum_{m=0}^{n-2}\bigl(x^{(m+1)(n+1)} - x^{m(n+1)+1}\bigr)\Bigr) \\
&= \sum_{m=0}^{n-1} x^{mn} - \sum_{m=0}^{n-2} x^{m(n+1)+1}.
\end{aligned}
\tag{3}
\]

It follows from (3) that

\[ 1 + x^{n(n+1)} - 2x^n\,\frac{(1-x)\bigl(1-x^{n(n+1)}\bigr)}{(1-x^n)(1-x^{n+1})} = 1 + x^{n(n+1)} + 2\sum_{m=1}^{n-1} x^{m(n+1)} - 2\sum_{m=1}^{n} x^{mn}. \tag{4} \]

Regrouping the positive terms on the right-hand side of (4) gives

\[ \sum_{m=1}^{n}\bigl\{x^{(m-1)(n+1)} + x^{m(n+1)}\bigr\} - 2\sum_{m=1}^{n} x^{mn} = \sum_{m=1}^{n}\bigl\{x^{(m-1)(n+1)} - 2x^{mn} + x^{m(n+1)}\bigr\}. \tag{5} \]

If we pair the $m$th term of this sum with the $(n+1-m)$th term, the right-hand side of (5) becomes

\[ \sum_{1 \le m \le n/2}\Bigl\{x^{(m-1)(n+1)}\bigl[1 - 2x^{n-m+1} + x^{n+1}\bigr] + x^{(n-m)(n+1)}\bigl[1 - 2x^{m} + x^{n+1}\bigr]\Bigr\} + x^{(n^2-1)/2}\bigl(1 - x^{(n+1)/2}\bigr)^2, \tag{6} \]

the last term being present only if $n$ is odd. This last term is always positive, so we are concerned with proving the sum of the remaining terms positive. Three possibilities exist. Firstly, both terms in the square brackets of (6) may be positive, in which case we can proceed. Secondly, both might conceivably be negative, whereupon, since $x^{(m-1)(n+1)} > x^{(n-m)(n+1)}$, the whole expression exceeds

\[ \sum_{1 \le m \le n/2} 2x^{(m-1)(n+1)}\bigl(1 - x^{n-m+1} - x^{m} + x^{n+1}\bigr). \]

Thirdly, their signs might differ, in which case $[1 - 2x^{m} + x^{n+1}] < 0$. So once again the expression (6) exceeds

\[ \sum_{1 \le m \le n/2} 2x^{(m-1)(n+1)}\bigl(1 - x^{n-m+1} - x^{m} + x^{n+1}\bigr) = \sum_{1 \le m \le n/2} 2x^{(m-1)(n+1)}\bigl(1 - x^{m}\bigr)\bigl(1 - x^{n-m+1}\bigr), \]

and this is visibly positive. Hence the left-hand side of (4) is positive and this may be rearranged to give the required result.

LEMMA 2:

\[ 1 + x^{n(n+1)} - 2x^n\,\frac{(1-x)\bigl(1-x^{n(n+1)}\bigr)}{(1-x^n)(1-x^{n+1})} - (1-x^n) - (1-x^{n+1}) < 0, \]

where $1 \le n \le 6$ and $0 < x < 1$.

PROOF: The expression, which we shall refer to as $f_n(x)$, may be rewritten in the form

\[ f_n(x) = -1 + x^n + x^{n+1} + x^{n(n+1)} + 2\sum_{m=1}^{n-1} x^{m(n+1)} - 2\sum_{m=1}^{n} x^{mn}. \]

One of the factors of $f_n(x)$ is $(1-x)$.
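We shall denote the other by $g_n(x)$, and its form may be shown to be

\[ g_n(x) = -\sum_{m=1}^{n} x^{m-1} - 2x^n + \sum_{m=n+1}^{2n-1} x^m - \sum_{m=2}^{n}\Bigl[\,\sum_{i=mn}^{m(n+1)-1} x^i - \sum_{j=m(n+1)}^{(m+1)n-1} x^j\Bigr]. \]

For $n = 6$:

\[
\begin{aligned}
g_6(x) = {} & -1 - x - x^2 - x^3 - x^4 - x^5 - 2x^6 + x^7 + x^8 + x^9 + x^{10} + x^{11} - x^{12} - x^{13} \\
& + x^{14} + x^{15} + x^{16} + x^{17} - x^{18} - x^{19} - x^{20} + x^{21} + x^{22} + x^{23} - x^{24} - x^{25} - x^{26} - x^{27} \\
& + x^{28} + x^{29} - x^{30} - x^{31} - x^{32} - x^{33} - x^{34} + x^{35} - x^{36} - x^{37} - x^{38} - x^{39} - x^{40} - x^{41}.
\end{aligned}
\]

This is clearly negative, and similar reasoning applies to values of $n$ from one to five inclusive. And since the multiplying factor $(1-x)$ is always positive, it follows that $f_n(x)$, too, is negative for $n \le 6$.

PROOF OF THEOREM 2 FOR $1 \le n_2 \le 6$: Using the notation of Figure 1, we are obliged to show under the stated conditions that $V(P_3)$ and $V(P_4)$ are both less than $V(P_0)$. If we represent the searcher's optimal strategy $U$ in the interval $P_0P_4$ by the vector $(u_1, u_2)$ as in Section 2, $V(P_0) > V(P_4)$ if $u_2 > u_1$. For the example of Figure 1, the sequence $U$ consists of the repeating cycle 1, 2, 1, 2, 1, corresponding to the following successive transformations of the interval $P_0P_4$:

\[ P_0P_4 \to P_1P_3 \to P_4P_5 \to P_3P_0 \to P_5P_2 \to P_0P_4. \]

In general, $U$ will merely be an extension of this; namely 1, 2, 1, 2, ..., 2, 1 where in a complete cycle there are as usual $(n_1 + n_2)$ searches. We can therefore write equations for $u_1$ and $u_2$ as follows:

\[ u_1 = \frac{1}{1 - r_1^{n_1}}\Bigl\{q_1\bigl(1 + 3r_1 + 5r_1^2 + \cdots + (n_1+n_2)\,r_1^{n_1-1}\bigr) + (n_1+n_2)\,r_1^{n_1}\Bigr\}, \]

\[ u_2 = \frac{1}{1 - r_2^{n_2}}\Bigl\{q_2\bigl(2 + 4r_2 + 6r_2^2 + \cdots + 2n_2\,r_2^{n_2-1}\bigr) + (n_1+n_2)\,r_2^{n_2}\Bigr\}. \]

Since $q_i = 1 - r_i$,

\[ u_1 = \frac{1}{1 - r_1^{n_1}}\Bigl\{1 + 2r_1 + 2r_1^2 + \cdots + 2r_1^{n_1-1} - (n_1+n_2)\,r_1^{n_1} + (n_1+n_2)\,r_1^{n_1}\Bigr\}, \]

\[ u_2 = \frac{1}{1 - r_2^{n_2}}\Bigl\{2 + 2r_2 + 2r_2^2 + \cdots + 2r_2^{n_2-1} - 2n_2\,r_2^{n_2} + (n_1+n_2)\,r_2^{n_2}\Bigr\}. \]

The last term in each equation is common, as is the multiplying quotient, because $r_1^{n_1} = r_2^{n_2}$. So the difference $(u_2 - u_1)$ is proportional to

\[ 1 + 2(r_2 - r_1) + 2(r_2^2 - r_1^2) + \cdots + 2\bigl(r_2^{n_2-1} - r_1^{n_2-1}\bigr) - 2r_1^{n_1-1} + r_1^{n_1}, \]

or

\[ 1 + \frac{2\bigl(r_2 - r_2^{n_2}\bigr)}{1 - r_2} - \frac{2\bigl(r_1 - r_1^{n_1}\bigr)}{1 - r_1} + r_1^{n_1}. \]

In order to show that this last term is positive, it will be sufficient to show that

\[ 1 + r_1^{n_1} - 2\bigl(1 - r_1^{n_1}\bigr)\Bigl(\frac{1}{1-r_1} - \frac{1}{1-r_2}\Bigr) > 0. \]

If we make the substitutions $n = n_2$, $r_1 = x^n$, $r_2 = x^{n+1}$, this expression becomes

\[ 1 + x^{n(n+1)} - 2\bigl(1 - x^{n(n+1)}\bigr)\Bigl(\frac{1}{1-x^n} - \frac{1}{1-x^{n+1}}\Bigr), \]

and this was shown in Lemma 1 to exceed zero for all relevant values of $n$ and $x$. Thus $V(P_4) < V(P_0)$ for all possible values of $n$ and $x$.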
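The two lemmas lend themselves to a quick numerical sanity check. The Python sketch below assumes the polynomial form $1 + x^{n(n+1)} + 2\sum_{m=1}^{n-1} x^{m(n+1)} - 2\sum_{m=1}^{n} x^{mn}$ for the Lemma 1 expression, and takes $f_n(x)$ to be that expression minus $(1-x^n)$ and $(1-x^{n+1})$; the function names are illustrative and not part of the paper. A grid check of this kind supports the lemmas but does not replace the proofs.

```python
def lemma1_expr(n: int, x: float) -> float:
    """Polynomial form of the Lemma 1 expression (right-hand side of (4))."""
    s1 = sum(x ** (m * (n + 1)) for m in range(1, n))      # m = 1 .. n-1
    s2 = sum(x ** (m * n) for m in range(1, n + 1))        # m = 1 .. n
    return 1.0 + x ** (n * (n + 1)) + 2.0 * s1 - 2.0 * s2


def closed_form(n: int, x: float) -> float:
    """Same quantity written with the two geometric-series quotients."""
    big = x ** (n * (n + 1))
    return 1.0 + big - 2.0 * (1.0 - big) * (
        1.0 / (1.0 - x ** n) - 1.0 / (1.0 - x ** (n + 1))
    )


def f(n: int, x: float) -> float:
    """The function f_n(x) of Lemma 2 under the stated assumption."""
    return lemma1_expr(n, x) - (1.0 - x ** n) - (1.0 - x ** (n + 1))


GRID = [k / 20.0 for k in range(1, 20)]    # x = 0.05, 0.10, ..., 0.95

if __name__ == "__main__":
    for n in range(1, 7):
        print(n, min(lemma1_expr(n, x) for x in GRID), max(f(n, x) for x in GRID))
```

On this grid the Lemma 1 expression stays positive and $f_n$ stays negative for $n \le 6$, in line with the lemmas; the grid is coarse, so sign changes confined to a narrow range of $x$ would not be detected.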
If we represent the segment $P_3P_0$ by the pair $W = (w_1, w_2)$, its slope will be positive if $w_1$ exceeds $w_2$. In the example of Figure 1, the optimal strategy $W$ consists of the repeating cycle 2, 1, 1, 2, 1, corresponding to the successive transformations

\[ P_3P_0 \to P_5P_2 \to P_0P_4 \to P_1P_3 \to P_4P_5 \to P_3P_0. \]

In general the sequence $W$ will be, as previously, an extension of this; namely, 2, 1, 1, 2, ..., 2, 1, where, as usual, in a complete cycle there are $(n_1 + n_2)$ searches. We may use the same procedure as before to show that $(w_2 - w_1)$ is proportional to

\[ 1 + r_1^{n_1} - 2\bigl(1 - r_1^{n_1}\bigr)\Bigl(\frac{1}{1-r_1} - \frac{1}{1-r_2}\Bigr) - (1 - r_1) - (1 - r_2), \tag{7} \]

and it is necessary to show under the stated conditions that this is always negative if we are to prove that the slope of $P_3P_0$ is positive. Once again, if we employ the substitution $n = n_2$, $r_1 = x^n$, $r_2 = x^{n+1}$, then expression (7) becomes identical to the function $f_n(x)$ of Lemma 2; if $n$ is less than or equal to six, this function is negative and the theorem is proved.

Computation of the function $f_n(x)$ of Lemma 2 shows that the lemma is true for $n \le 12$, but not for $13 \le n \le 16$. Moreover, it is easy to show analytically that $f_n(x) > 0$ when $n \ge 17$ and $x = (\text{[illegible]})^{1/n}$. These observations complete the proof of the theorem.

Significance of Theorems 1 and 2

The situation which these two theorems elucidate is portrayed in Figure 2, which shows the relationship between $P_0$ and $P^*$ for one half of all the pairs $(q_1, q_2)$; the other half is symmetrical to the one shown. The straight line $r_1 = r_2$ is shown as a continuous line to indicate that $P_0 = P^*$ for all values of $q_1$ and $q_2$ throughout its length (Theorem 1). By Theorem 2, the same holds for the curve $r_1^2 = r_2$, and between this curve and $r_1 = r_2$ there are 11 other curves ($r_1^3 = r_2^2$, for example) on which Theorem 2 does hold. Between any two of these there are curves which are characterized by the feature that for the smaller values of $(q_1, q_2)$, $P^*$ and $P_0$ are not the same.
One example is the curve $r_1^7 = r_2^4$, which appears as a broken line where $P_0$ and $P^*$ differ. Above the curve $r_1^2 = r_2$, the situation may be represented quite simply. The shaded portion of Figure 2 shows where $P^*$ and $P_0$ differ; otherwise they are the same. It is interesting to note that where $P_0$ and $P^*$ do differ, then their relationship is such that $p^* < p_0$ if $r_1 > r_2$, where as usual $P = (p, 1-p)$. There does not seem to be any obvious reason for this.

How can one explain the characteristic that, generally speaking, for high probabilities of detection $P_0$ and $P^*$ are equivalent, whereas they tend to differ at the lower detection probabilities?

[Figure 2. The relationship between $P_0$ and $P^*$. (In general, the area in which $P_0 \neq P^*$ is shown shaded.) Legend: curves $r_1^{n_1} = r_2^{n_2}$ for $(n_1, n_2)$ = (5, 1), (4, 1), (3, 1), (5, 2), (2, 1), (7, 4), (3, 2), (4, 3), (5, 4), (1, 1); continuous line where $P_0 = P^*$, broken line where $P_0 \neq P^*$.]

The answer seems to be as follows. As already noted, if the evader could move between searches, he would always ensure that his location probability vector would be $P_0$. However, he is not free to move, but is permitted to choose an initial vector which becomes transformed according to (1) as successive searches take place. The evader should therefore choose $P$ so that, in some average sense, the sequence of transformed location vectors is near $P_0$. When the detection probabilities are high, the later members of the sequence are unimportant since they are reached with very small probability, so the evader can do no better than choose $P_0$.
However, when the detection probabilities are smaller, the search can be expected to last longer; the possibility exists that the evader can offset his initial probability distribution away from $P_0$ in such a way that the initial advantage given to the searcher thereby is more than compensated for by the gain to be achieved by bringing subsequent location probability vectors nearer to $P_0$.

As a typical example, Table 1 shows the relationship between $V(P_0)$ and $V(P^*)$ on the curve $r_1^{\text{[illegible]}} = r_2$. As $q_1$ decreases below about 0.65, $P^*$ ceases to be the same as $P_0$ and the proportional difference between $V(P_0)$ and $V(P^*)$ increases, reaching a maximum between $q_1 = 0.3$ and $q_1 = 0.4$. As $q_1$ is lowered still further, the proportional difference decreases.

Now suppose that the detection probabilities and the duration of a search period tend to zero in such a way that the expected duration of the search remains finite and nonzero. In the limit the problem becomes one in continuous time, and as we now proceed to show, for such problems $P_0$ and $P^*$ are once again identical.

The Two Region Problem in Continuous Time

The two regions are now characterized by their detection rates $\lambda_1$ and $\lambda_2$. As before, the evader may not move, but the location probability vector changes by Bayes' theorem as the search proceeds. We indicate its value at time $t$ by $(p_1(t), p_2(t))$; $P_0$ is now defined to be $\{P : p_i\lambda_i = \text{constant}\}$. In the continuous time problem, it is natural to allow the searcher to divide his effort at any given time between the two regions. If we call the proportion of his total effort which he allocates to region $i$ at time $t$ $u_i(t)$ (with $u_1(t) + u_2(t) = 1$), the probability of the evader being detected in $(t, t + \delta t)$, given that detection has not occurred before time $t$, is

\[ \bigl[\lambda_1 p_1(t)u_1(t) + \lambda_2 p_2(t)u_2(t)\bigr]\,\delta t + o(\delta t). \tag{8} \]
The continuous time analogue of the transformation (1) is

\[ \dot p_1(t) = p_1(t)\,p_2(t)\bigl(\lambda_2 u_2(t) - \lambda_1 u_1(t)\bigr) = -\dot p_2(t), \tag{9} \]

where $\dot p_i(t)$ denotes the time derivative of $p_i(t)$. This model has been used by Dobbie [2] to analyze a one-sided moving target problem. Equations (8) and (9) hold if $u_1(t)$ is right continuous at $t$, although the argument which follows may be shown to require only that $u_1(t)$ be Lebesgue integrable.

Once again we proceed to determine the evader's maximin strategy on the assumption that he may choose only his initial location probability vector $P$. So let us suppose that the searcher knows $P$ ($= (p_1(0), p_2(0))$). If we take $\lambda_1 p_1(0) > \lambda_2 p_2(0)$, which we may do without loss of generality, the searcher will search region 1 until time $\tau$ when $\lambda_1 p_1(\tau) = \lambda_2 p_2(\tau)$. For $t > \tau$, he will allocate his effort such that $\lambda_1 p_1(t) = \lambda_2 p_2(t)$. These rules are the continuous time version of Black's rule (see section 1) and follow from the process of taking the continuous-time limit of the discrete-time problem. From (9) it follows that

\[ u_1(t) = \frac{\lambda_2}{\lambda_1 + \lambda_2} \quad \text{for } t > \tau. \]

[Table 1. Relationship between $V(P_0)$ and $V(P^*)$ for decreasing values of $q_1$; the tabulated entries are not recoverable.]

Thus, if the evader is in region 1 (i.e., $p_1(0) = p_1(t) = 1$), the detection rate (analogous to the hazard rate of renewal theory) is $\lambda_1$ for $t \in [0, \tau)$ and $\lambda_1\lambda_2/(\lambda_1 + \lambda_2)$ for $t \in [\tau, \infty)$, and if the evader is in region 2, the detection rate is 0 for $t \in [0, \tau)$ and $\lambda_1\lambda_2/(\lambda_1 + \lambda_2)$ for $t \in [\tau, \infty)$. This means that if the evader is in region 1 the probability density for the time of detection is

\[ \lambda_1 \exp(-\lambda_1 t), \quad 0 < t < \tau, \qquad \text{and} \qquad \exp(-\lambda_1\tau)\,\frac{\lambda_1\lambda_2}{\lambda_1+\lambda_2}\exp\Bigl\{-\frac{\lambda_1\lambda_2}{\lambda_1+\lambda_2}(t-\tau)\Bigr\}, \quad t \ge \tau, \tag{10} \]

while if the evader is in region 2, the respective densities are

\[ 0, \quad 0 < t < \tau, \qquad \text{and} \qquad \frac{\lambda_1\lambda_2}{\lambda_1+\lambda_2}\exp\Bigl\{-\frac{\lambda_1\lambda_2}{\lambda_1+\lambda_2}(t-\tau)\Bigr\}, \quad t \ge \tau. \tag{11} \]

To obtain the probability density for the time of detection when it is known only that the evader chooses region $i$ with probability $p_i(0)$ ($i = 1, 2$), we simply multiply (10) by $p_1(0)$ and (11) by $p_2(0)$ and add. The expectation $V(P)$ of the time of detection may then be written as an integral in the usual way if we use this density. In Dobbie's notation, this density is the negative of the derivative with respect to time of $q(t)$. An alternative derivation involves solving the differential equations (2) and (3) of that paper with $a_1 = a_2 = 0$, and $f(t) = u_1(t)$. We can therefore write for $V(P)$:

\[
\begin{aligned}
V(P) &= \int_0^{\tau} \lambda_1 p_1(0)\, t \exp(-\lambda_1 t)\,dt + \bigl\{p_1(0)\exp(-\lambda_1\tau) + p_2(0)\bigr\}\frac{\lambda_1\lambda_2}{\lambda_1+\lambda_2}\int_{\tau}^{\infty} t \exp\Bigl\{-\frac{\lambda_1\lambda_2}{\lambda_1+\lambda_2}(t-\tau)\Bigr\}\,dt \\
&= p_1(0)/\lambda_1 + p_2(0)\bigl\{2/\lambda_1 + 1/\lambda_2 + \tau\bigr\},
\end{aligned}
\]

and it is easy to show that this expression is maximized when $\lambda_1 p_1(0) = \lambda_2 p_2(0)$, i.e., at the point $P_0$.

Approximately Optimal Strategies

Looking again at Figure 2, it is convenient to divide the range of $q_1$, $q_2$ into two separate parts. Firstly, there is the area between the curves $r_1^2 = r_2$ and $r_1 = r_2$ in which Theorem 2 is relevant. For the several hundred points within this area for which values were calculated, the inequality $V(P^*) < 1.00075\,V(P_0)$ was found always to hold. Above the curve $r_1^2 = r_2$, the proportional difference between these two increases as $q_1$ and $q_2$ become more widely separate from one another, though it still tends to be small. For instance, on the curve [illegible], it is always true that $V(P^*) < 1.035\,V(P_0)$.
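The continuous-time expression $V(P) = p_1(0)/\lambda_1 + p_2(0)\{2/\lambda_1 + 1/\lambda_2 + \tau\}$ derived in the preceding section can be explored numerically. The Python sketch below is an illustration only. It assumes $\tau = \lambda_1^{-1}\ln\bigl(\lambda_1 p_1(0) / (\lambda_2 p_2(0))\bigr)$, which follows from (9) with all effort on region 1 but is not written out in the text, and it handles the case $\lambda_1 p_1(0) < \lambda_2 p_2(0)$ by relabelling the regions. The function name and the rates used are arbitrary choices.

```python
import math


def expected_detection_time(p: float, lam1: float, lam2: float) -> float:
    """V(P) for P = (p, 1 - p) in the continuous-time two-region problem."""
    if lam1 * p < lam2 * (1.0 - p):
        # Relabel the regions so that region 1 is the one searched first.
        return expected_detection_time(1.0 - p, lam2, lam1)
    if p >= 1.0:
        return 1.0 / lam1                 # evader certainly in region 1
    # Time at which lam1 * p1(t) first equals lam2 * p2(t) (assumed form).
    tau = math.log(lam1 * p / (lam2 * (1.0 - p))) / lam1
    return p / lam1 + (1.0 - p) * (2.0 / lam1 + 1.0 / lam2 + tau)


lam1, lam2 = 1.0, 3.0
p_equal = lam2 / (lam1 + lam2)            # point where lam1 * p == lam2 * (1 - p)
v_equal = expected_detection_time(p_equal, lam1, lam2)
grid = [k / 100.0 for k in range(1, 100)]
v_grid = [expected_detection_time(p, lam1, lam2) for p in grid]

if __name__ == "__main__":
    print(p_equal, v_equal, max(v_grid))
```

With $\tau = 0$ at the equalising point the formula reduces to $1/\lambda_1 + 1/\lambda_2$, and no grid point exceeds that value, which is consistent with the maximization claim.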
Moreover, for any two-region problem for which $(q_1, q_2)$ lies between $r_1^2 = r_2$ and [illegible], $V(P^*) < 1.035\,V(P_0)$, and usually this inequality is very easily satisfied.

These figures follow from the examination of a large number of points; however, they must be accepted with one reservation. This follows from the fact that we are in a position to calculate $V(P_0)$ only if the relationship $r_1^{n_1} = r_2^{n_2}$ holds; there seems to be no way of finding $V(P)$ otherwise, so of course the possibility exists that the equivalence of $P_0$ and $P^*$ over some portion of the curve [illegible], say, is due merely to the rationality inherent in the problem. Were this so, one might possibly expect to see some manifestation of this along a curve very close to it ([illegible], for example) but to date no unaccountable variation has been found.

The evader may therefore for any particular practical application assess the advisability of calculating $P^*$ or of merely regarding $P_0$ as a satisfactory approximation. The latter will almost certainly be legitimate if $(q_1, q_2)$ lies between $r_1^2 = r_2$ and $r_1 = r_2$. Otherwise, one must weigh the advantage of knowing $P^*$ precisely against the extra effort required to ascertain it.

The searcher also has two choices. In principle he can calculate $P^*$, determine the two pure strategies which give $V(P^*)$ for the expected duration of the search if the evader plays $P^*$, and choose randomly between them (see [3], Chapter 2) to ensure an expected duration of no more than this amount. Alternatively, he can simply assume that the evader plays $P_0$ and follow one of the two search strategies which limit the expected duration to $V(P_0)$ when the evader makes this choice. Let $(a_1, a_2)$ be the coordinates (see section 2) of one such sequence: thus, if this is the sequence chosen by the searcher, the expected duration is no greater than $\max(a_1, a_2)$, however the evader plays. When $r_1 > r_2$, the searcher should select the sequence beginning 1, 2, ....
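The coordinates $(a_1, a_2)$ of a cyclic search sequence, and the guarantee $\max(a_1, a_2)$, are straightforward to compute. The Python sketch below is illustrative only: the function names are assumptions, and the two cycles are the ones quoted for the Figure 1 example ($r_1 = 0.8$, $r_1^3 = r_2^2$). If the evader is in region $i$ and the cycle of length $T$ searches that region at times $t_1 < \cdots < t_k$, the expected duration $a$ satisfies $a = \sum_j q_i r_i^{\,j-1} t_j + r_i^{\,k}(T + a)$, which is solved for $a$ below.

```python
from typing import Sequence, Tuple


def coordinates(cycle: Sequence[int], r: Tuple[float, float]) -> Tuple[float, float]:
    """(a_1, a_2) for a searcher who repeats `cycle` (entries 1 or 2) indefinitely."""
    length = len(cycle)
    coords = []
    for region in (1, 2):
        esc = r[region - 1]                      # escape probability of this region
        times = [t for t, s in enumerate(cycle, start=1) if s == region]
        k = len(times)                           # searches of this region per cycle
        within = sum((1.0 - esc) * esc ** j * t for j, t in enumerate(times))
        coords.append((within + esc ** k * length) / (1.0 - esc ** k))
    return coords[0], coords[1]


def value(a: Tuple[float, float], p: float) -> float:
    """V(A, P) = a_1 p + a_2 (1 - p)."""
    return a[0] * p + a[1] * (1.0 - p)


r1 = 0.8
r2 = r1 ** 1.5                                   # so that r1**3 == r2**2
p0 = (1.0 - r2) / ((1.0 - r1) + (1.0 - r2))      # p*q1 == (1-p)*q2
u = coordinates((1, 2, 1, 2, 1), (r1, r2))       # cycle beginning 1, 2
w = coordinates((2, 1, 1, 2, 1), (r1, r2))       # cycle beginning 2, 1

if __name__ == "__main__":
    print(u, w)
    print(value(u, p0), value(w, p0), max(u), max(w))
```

In this example the two cycles give the same expected duration at the equalising point, while the cycle beginning 1, 2 has the smaller worst-case coordinate, which agrees with the rule stated above for $r_1 > r_2$.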
Even if $P^*$ differs from $P_0$ and the evader is in fact playing $P^*$, this strategy is still preferable. There are exceptions to this rule, but they occur only over a very narrow range of values of $r_1$ and $r_2$, only when they are very nearly equal, and even then the difference in payoffs between this strategy and its alternative, the other pure strategy beginning 2, 1, ..., is extremely slight. When this rule is followed, and $P_0$ and $P^*$ are the same, $a_2 \ge a_1$, and $a_2$ is the payoff which the evader does gain by actually hiding in region 2, the worst thing that can happen from the searcher's point of view. Under these conditions, by considering the difference between the expected durations of the search for the searcher's two optimal pure strategies, it is easy to show that [illegible] is an upper bound on the difference between $a_2$ and $V(P_0)$; in practice, the difference appears to be invariably less than [illegible]. Should the rule be used when $P_0$ and $P^*$ no longer coincide, the worst thing that can happen to the searcher's aims occurs if the evader actually hides in region 1. However, the payoff function over the recurrent region is usually very flat, and computed examples show that for this case the maximum penalty which the searcher might incur by using a pure strategy seems always to be much less than 1/2.

It is natural to wonder whether $P_0$ remains a good approximation to $P^*$ when there are more than two regions. If true, this would be a result of wide practical significance. Preliminary investigations indicate that this is indeed so, and these will be reported on in due course.

ACKNOWLEDGMENT

This paper has benefited considerably from suggestions made by Dr. John Loxton of Trinity College, Cambridge, to whom the authors wish to express their gratitude.

REFERENCES

[1] Black, W. L., "Discrete Sequential Search," Information and Control 8, 159-162 (1965).
[2] Dobbie, J. M., "A Two Cell Model of Search for a Moving Target," Operations Research 22, 79-92 (1974).
[3] Norris, R. C., "Studies in Search for a Conscious Evader," MIT Lincoln Lab. Tech. Rpt. No. 279 (1962).

THE QUEUE G/M/m/N: BUSY PERIOD AND CYCLE, WAITING TIME UNDER REVERSE AND RANDOM ORDER SERVICE

Stig I. Rosenlund
University of Goteborg
Goteborg, Sweden

ABSTRACT

The busy period, busy cycle, and the numbers of customers served and lost therein, of the G/M/m queue with balking are studied via the embedded Markov chain approach. It is shown that the expectations of the two discrete variables give the loss probability. For the special case G/M/1/N a closed expression in terms of contour integrals is obtained for the Laplace transform of these four variables. This yields as a byproduct the LIFO waiting time distribution for the G/M/m/N queue. The waiting time under random order service for this queue is also studied.

1. MAIN RESULTS

We present a closed expression for the joint busy period and busy cycle distribution in the G/M/1/N queue, (11). This yields the waiting-time distribution under queue discipline last in-first out (LIFO) in the queue G/M/m/N, (18). A recursive algorithm for the waiting-time distribution in this queue under random order service is also given, (20), (27), (28), and (29).

2. THE MODEL

Consider the following balking model. There are $m$ servers. Customers arrive with independent interarrival times, each with distribution function $F$, $F(0) = 0$, and Laplace transform

\[ \int_0^\infty e^{-st}\,dF(t). \]

Let

\[ a = \int_0^\infty t\,dF(t). \]

A customer finding on arrival $k$ other customers in the system enters it with probability $p_k$ and is lost with probability $1 - p_k$. Here $p_k = 1$ for $k < m$. The service times are independent and have the common density $\mu e^{-\mu t}$. The queue discipline is nonscheduled, e.g. FIFO, LIFO, or random order service. The G/M/m/N queue (i.e., the G/M/m queue with a finite waiting room of size $N$) is obtained by putting $p_k = 1$ for $k < m + N$ and $p_k = 0$ for $k \ge m + N$.

3.
PREVIOUS WORK IN THE AREA The GIMIm/N queue seems to have been introduced by Takacs [10], who gave the FIFO waiting-time distribution and the long run state probabilities of the embedded Markov chain. For m=l and general p^ the latter were obtained by Finch [5]. The pure loss system A^=0 was studied by Palm [6] and Takdcs [9]. None of these authors treated the busy period or the busy cycle. For the nonbalking queue where pic=^l for all k (or N= <» ) , however, these random variables were treated by Bhat [1], Cohen [3], and De Smit [4], who gave explicit expressions for their Laplace transforms. Recently, Bhat [2] indicated an algorithm for calculating the busy-period transform in the G/M/m/N queue. A simple closed form has, however, been missing; for m=l this is provided in the present paper by (11). For the special case F{t) = l — e~^', i.e., the M/M/m queue, special methods are available. See Rosenlund [7] for the busy period and the LIFO and random order waiting time in the M/M/m/N queue. 4. THE BASIC SYSTEM OF EQUATIONS Let us define rj= arrival time of the ith customer. Then ri, t2-ti, T3-T2, . . . are IID with c.d.f. F. ri(t) =:number of customers in the system at time t {r]{t)=0 for t<CTi) Y{t) =remaining length of time from t until first emptiness again Z{t) = remaining length of time from t until for the first time again a customer arrives to an empty system X{t) = number of customers whose services end in (t, t-\-Y{t)] V{t) = number of customers lost in {t, t-\-Y{t)). Put (F, Z, X, V) = {Y{ti), Z{ti), X{ti), V{ti)). Our interest here is in the joint distribution of these four variables: (the first) busy period Y, busy cycle Z, number of served customers in the busy period X, and number of lost customers in the busy period V. To get at it we use the standard embedded Markov chain approach, using supplementary distributions. For s,p>0 and 0<u,w<l /,(s,2>, u, w,) = £;(e-'^<^i'-''^'^>'it^<^-'w'''^''|7;(T,-fO)=A:). 
Here i>k but otherwise fk is independent of i, due to the memoryless property of the service dis- tribution. Evidently we ha,\e fi=E{e~'^~'''^u^w^). We will need the transition probabilities of the pure homogeneous death process with death intensity m min (n, m) at state n. These are ({'^)e-"''il-e->''y-\ 0<i<k<m (1) P*,i«) = m and give the probability that the number of customers present decreases from A: to i in a time interval oflength t during which no customers arrive. a^,i(s,p, u,w) = ^ THE QUEUE G/M/m/N 109 Put +^,_,w*-*+i f e-''+'>^'p,,^_,{t)dF{t), 2<i<k 0, ^>A:+l; his,p,u)=u'^j\-^^+^^'p,,oit)dF{t)+s j\-^'(^£e-^''p,,o{y)dyJdF(t)'j■ We have the basic set of equations (2) .f>c=^a,,J,+h, k^l,2,... . i = l We omit the laborious but rather straightforward probabiUstic-analy tic calculation leading up to it. By a series expansion for/^ in terms of a^, < and hj, one may show that (2) has a unique solution (/i./z, • • ■) under the condition that the system functions properly in the sense that it empties infinitely often almost surely (i.e., with probability one). When Pn=^ for some n this condition is fulfilled and (Ji, J2, ■ • ■, fn) is uniquely determined by the first n equations of (2). Hence we may represent/i as a ratio between two determinants according to Cramer's rule. For the G/M/ljN queue this representation will be exploited in section 6 to give the relatively simply expression (11). 5. EXPECTATIONS; LOSS PROBABILITY Differentiating both sides of (2) and putting (s, p, u, w) = (0, 0, 1, 1), we get linear systems for the conditional expectations EY{t i)\rj{T i+0) =k) , etc. The matrix of coefficients is /— {a^iiO, 0, 1, 1)}, while bic is replaced by: (3) r (l-p*.o(«)) {l-F{t))dt for Y a for Z k-J2i r°°P* i{t)dF{t) forX t=i Jo i:(l-Pi) r PKii.t)dF{t)iovV. i=l Jo From this we get E{Y), E{Z), E{X), and EiY). Note that Z is a sum of Z+F independent inter- arrival times, so that by Wald's lemma (4) E{Z)=a{E{X)+E{V)). IIQ S. I. 
ROSENLUND For the single server queue, moreover, F is a sum of X service times, and so (5) E(Y)=,jL-'EiX), m=l. The knowledge of EiX) and E{V) is very useful, because it gives us the long run loss probability. Indeed, this holds for any queue such that successive busy cycles (almost surely finite) and what happens therein are IID. Define Xr= number of served customers in the rth busy period yr= number of lost customers in the rth busy period M„=number of served customers out of the n first arriving. PROPOSITION: If EiX)<o., E{V)<^, then E{X) MJn- E{X)+EiV) PROOF: Fix an outcome in the almost sure event {{X,+ . . . +Xr)/r^E{X)} n {(^1+ . . . +Vr)/r-^E{V) ]. Put a = EiX)l(E{X)+EiV)) andw,=Xi+Fi+ • • • +Xr+Vr. Then M„=Xi-}- . . . +Xr, so MnJn,=={MJr)/{nr/r) ->a as r^oo. For nr<Cn<nr+i we have \Mn/n-a\ = \MJn-M„Jnr+M„Jn-a\ < {M„^^-M„^) /n,+M„^ (n.+j-n,) /n,^ + \MJnr-a\ = {Xr+i/r)/{nJr) + ({Xr+,+Vr+:)lr){MJnr)/{n,/r) + \MJn-a\->0 &sn (and thus also r) tends to «>. Q.E.D. It follows that E(V)liE{X)+E{V)) is the loss probability. Compared with the traditional method of approaching the loss problem through the long run (stationary) state distribution of the embedded Markov chain, the present method has definite advantages from conceptual and computational viewpoints. 6. THE G/M/l/N QUEUE We take m=l, 2^0=- • • =PN='i-, Pn+i=Pu+2= ■ ■ • =0, for some AT' >1. Define ^jv+i=det (/-{ai,J), the determinant of the (A^'+l) X(A^+1) matrix of coefficients of the system (2). Put Bn for the determinant obtained from A^ by replacing the first column with [bi, 62, • • •, b„]^. Then by Cramer's rule Putting ar=u' j r THE QUEUE G/M/m/N 111 we set up Bn- B.-= hi — Oo 62 1 — 0,1 —O'o 63 —a2 1— fli 6„_2 — a„_3 —dn-i ■ ■ ■ 1— ffli —do Ofi-l (^n-2 0,71-3 — a2 1 — di — flo — tta — a2 1— «! — wtto The cofactor of 6] is A„, since the first column of An is [1, 0, . . ., 0]^. For m=l we have Define the power series k = l which converges at least for 1 3 1 -^ 1 , since < 6^^ < 1 . 
Then by some calculations we get (6) J .s HUz{F{p)-F{s+p+n-nU2)) S + fi—fJiUZ Now let the determinant C„ be obtained from An by putting w=0. Define Ci = l. Splitting the last column of -4 „ in two parts we find An=Cn—waoCn-i, n>2. Since AnT^O also for ^=0 we have C„5^0. Since C„ is continuous in « and equal to one for w = 0, this implies C„>0. The continuity of An in w further implies ^„>0. Developing C„ along the last row we get i=l Thus l = Ci>(72> . . . >Cn-l>Cn> . . . >0. Putting c„=Cnao-" and c(2) = X) c„0" A (converging at least for |2|<ao=i^(s+p+M)) we have n-l Cn-l = ^Cl,iCn-u n>2, 1=0 n=2 "-" 71=2 =0 1=0 n = !+l Hence (7) C(2) = ^^ F{s-\-p-\-fj,—iiUz) —2 112 S. I. ROSENLUND whenever the power series converges and the denominator is 5^0. Let g{s-{-p, u; n) be the smallest solution X in (0, 1] of the equation F{s+p + n—iJiUx) = x. When {s, p, u) 9^ (0, 0, 1) or Ma>l it is known that 0<(/<l. The function c is analytic for |2 | <Cg, where its MacLaurin series converges. The series does not converge for z=g, because c{z)^^ as z-^g—. It follows that the radius of the circle of convergence for c{z) is g{s+p, u; n). We proceed in the same way with S„. Let Z>„ be obtained from 5„ by replacing l—ai — waa with 1-ai. Put A = fti- Then Bn^Dn-waoD„_i, n>2. On developing Z>„ along the first column we get i=l Putting rf„=Z>„ao~" we have Since 0</i<l also for w=0 we have 0<d„<c„, and the series n=l converges at least for \z\<ig. We get n (9) C?„=X1 biCn-i+\ i = \ and h{z) (10) d{z)- F{s-\-p-{-tx—nuz) — z From (6), (7), (8), and (10) we obtain finally by contour integration A A jl—wz) iF(p)~F{s+p-\-n—nuz))d2 1 X tin 1 r {l—wz)dz ~ ^''*-^'^'='2^+'(^(s + p+M-MW2)-2) Here 0<r<p(s+p, u; n). The integration is counterclockwise on a circle in the complex 2-plane. For A^=0 the first equation of (2) gives separately (12) E{e-'-^^u-wn= M^(-^(y)--^^(^+p+M)) . 
{s+n){\-wF{s+p+„)) Putting eo=l and, for 7i>l, '"=2^ f , , —^ (0<^<^(0, 1; M)), ^'^''^'^I='2«(^(m-M3)-3) THE QUEUE G/M/m/N 113 we get for the expectations (13) E{Y)=^,-He^+^-l){e^+^-e^)-' E{Z)=aeff+i{e!f+i—eff)-^ £:(Z) = (e;^+i-l)(e^+i-6^)-' E{V) = {e^+,-e^)-K The calculus of this section, consisting of operating on the determinants in an expression for the transform obtained via Cramer's rule to arrive at contour integral expressions, is parallel to the calculus in Rosenlund [8] on the busy period of a finite M/G/l queue. Incidentally, e„ coincides with the expected number of served customers in a busy period of the M/Gjl queue with n waiting places, service time distribution F, and intensity of arrival m (see Tomko [11], (2)). 7. THE G/M/l/a> QUEUE We can find E{e~'^~^^u^) for A^= oo by letting N-^ oo , because the sequence of probability distributions of (F, Z, X) with finite A^^ converges weakly to the probability distribution for infinite waiting room. The convergence is shown as follows. Assume myLa>\. Denote by Pn and P^ the probability measures defined for the queues G/M/m/N and G/M/m/ oo , respectively. Put C7= maxi- mum number of customers present simultaneously in the system during the (first) busy period (n, Ti + F). Let A=\{Y, Z, X)iB}, B be any Borel set in i?3. Then P„{A)^Pn{AV[ \U<m+N}) +PN<,AU[U=m+N]). We have PN{A[]{U<m-\-N})=PJAV[\U<m-[-N}), since until the waiting room becomes full the system behaves as it would have done with infinite waiting room. Further, PN{A^[U=m-\-N])<PN{U=m+N)=P„{U>m-[-N)-^Q as iV^oo, for U is almost surely finite when miia>\. Hence PN{A)^yP„{A). For m=\ we now take w=l, it>0, and (s, p, u)9^{0, 0, 1). From (8) then. lim /,= lim ^ n-dJd^^A Put gn=gn{s+p, u; n)=Cn-i/cn. Wc havc the LEMMA: It holds 0<^„_i<gr„<l for (s, p, u)^(0, 0, 1), w>0. PROOF: We use a bit of a sly dodge here. 
Specializing (11), we get E{e-'"') = l-il-Fip))/ (1 —QN+iip, 1 ; m))- Since the distribution function of Z decreases with N, this implies Qn-iip, 1 ; m)< gnip, 1; m) for pyo, m>0. The lemma then follows from the relation gn{s+p, u; n)=g„(s-^p-\-n (l—u), 1; ixu) and from the construction of c„ from the positive determinants A„ and C„. Q.E.D. Hence lim gn exists >0, and so gn—*g, the radius of the circle of convergence for c{z). It follows that^„<y<l. We will show that lim rf„/c„ exists >0. Then dn-Jd^=gnf^-'/^g. Cn-1 <*n 114 S. I. ROSENLUND Hence (14) lim /i= lim c?„/c„. W-o Now by (9) we obtain This gives n / n \ i=l \fc=n-i+2 / djc„<±b,g'-'<b{g)/g, i=l and so \im d„/Cn<b{g)/g. Now take r so large that for €>0 we have Z: b,g'-'<e. i = r+l Forn>r, It follows that dnlCn>iZbi( n gX i = l \(£ = n-i+2 / lim4/c„>i]6tlim(' A ^*)=i) 6.5*~'>ft (^)/£'- i=l \t = n-i-|-2 / 1 = 1 Thus lim dn/Cn=b{g)lg, and so by (14) and (6) (15) ^(,-..-..^x)^ M^(^(p)-y (^+y, u; m)) , ^^ ^^^_ A We have in the numerator of (6) used the identity F{s+p+iJ.—fiug)=g defining g. The result (15) agrees with (3.5) in De Smit [4] on specializing the latter to the single server case, where the amount of work during the busy cycle equals the busy-period length. 8. THE G/M/2/oo QUEUE For the many server case one may use in principle the same calculus as was used in sections 6 and 7, although the complexity of the expressions increases rapidly with m. The analogue of (11) for m=2 is too long to write out here. For the GIMI2J ca queue we obtained by letting N-^<^ (16) (^-c-'^-^^u^)- i^uiF{p)-hs+p+i^)) 2nu'F{s+p+n){Fip)-g{s+p,u;2f,)) (s+m)(1-2«^(s+P+m)) {s+2fjL-2^^ug{s+p, u; 2,x))il-2uF{s+P+fjd) m=2, 7V=oo. The busy-period and busy-cycle distributions for G/MI2I co were found by Bhat [1]. See also De Smit [4], eq. (3.5), (3.8), and (3.9). The joint distribution of Y and Z cannot be characterized by these earlier results, though, and so the result (16) seems to be new. 9. 
WAITING TIME IN THE FINITE QUEUE UNDER REVERSE ORDER SERVICE Consider a customer who arrives at a G/M/m/N queue with service discipline LIFO and finds k customers before him in the queue {m + k in the system), k=0, . . ., A^^l. Then after his joining THE QUEUE G/M/m/N 115 the queue there are N-k-l waiting places left, and he has to wait until for the first time again there are k customers in the queue before commencing service. The process of served customers leaving the system during the waiting time takes place as from a G/M/l/N-k~l queue with service time density mjue"""'. Accordingly, the waiting time W is distributed as the busy period of this queue, conditional on k. For {p, u, w) = iO, 1, 1) we denote by hff.^is) the transform given by (11) and (12). With P,=limP(„(T,-0)=A:), it is then clear that for the long run (17) N-l m+k. k=0 ^m\ • • • "I-l m+AT-l "'N-k-X.miiis) ■ We define W to be zero for a lost customer. Now from Takdcs [10], eq. (8), we have Pm+k=Aqi,_^, 0<k<N, where ^ is a constant ([10], (10)) and q^ are defined by A •'"^ F{mfi—mfjiz) — z (Then qo=l SindA=Pr„+^=EiV)/{EiX)+E{V)), by the Proposition of section 5.) By (11), (12), and (17) we finally get, putting for brevity ^=mn, for s>0 (18) N E{e-"^\Wyo)=^ f - {l-F{0-0z))dz z^{Fi0-0z)-z) J,l-r2^ dz 2iri (Fi0-0z)-z) 0{l-z)a-F{s+0-l3z))dz zi (s +^-/3 2) {F{s +0-^z) - z) f - {l-F{s-\-0-0z))dz z^{Fis+0-0z)-z) J O<r<gis,l;0). For the unconditional distribution we have (19) E(e-"^) = 1-A({2iri)-' £ — dz z'^{Fi0-0z)-^') l)il-E(e-''^\W>0)), 10. WAITING TIME UNDER RANDOM ORDER SERVICE IN GIM/m/N For the queue G/M/mjN we shall give an algorithm for the waiting-time distribution under random-order service. Consider a time epoch r^ when a customer arrives and let Wi be his waiting time. Let Pk and 0=my. be defined as in section 9 and let >F be a random variable with the limiting distribution for Wt as i-^ oo . Put g^As) = E{e-''^^\-n{r^-Q)^m+k-\), k=l, . . ., N. 116 S. I. 
ROSENLUND Then (20) £(e-"^|lF>0)-S Pm+k-x{Pm+ . . . PM+,,-l)-'gNAs), while E{e-'^) is given by (19). Now the transforms Qn. k can be solved from a linear system reminiscent of the system (2) . In this context, let Fi, F2, • • • be a sequence of independent variables exponentially distributed A\ath mean 1//8; this will represent the beginning of the sequence of interdeparture intervals starting at Ti. Put S'o=0, *S'^=Fi+ . . . -\-Y j. Conditioning on the interarrival period r^+i — ti we then have, defining Qn.n+i^^Qn.n and r— 1 z;=o, (21) Eie-"^<\n{T,-0) = m+k-l, r,+ -u=t)=T, P{S,-r<t<Sk-r+,) r = l •[^ e-"^;v,.+,(s)+£ I E{e-''.\S,.r<t<S,^r+,)^^£ ^^I '"'' (^ I E(e-'''\S,=x)yx. Integrating in t with respect to F we get a linear system. The expressions seem unwieldy but can be much simplified. First we note that Sj represent the points of a Poisson process, and it is known that the successive points occurring in (0, t) are, conditional on the number of points in (0, t) being k—r, distributed as an ordered sample of size k—r from the uniform distribution on (0, t) Hence Sj/t has under this conditioning a beta distribution with parameters j and k —r —j-\-l. Furthermore, SJSk has ior j<^k a beta distribution with parameters j and k —j and is independent of Sk- So we can express (21) in terms of integrals in y of e'"" and e"*^", respectively, over (0, 1) with respect to beta distributions. Next we expand e'"" and e^""^ in MacLaurin series and integrate term by term in y over (0, 1). Then we integrate term by term in x over (0, t). Rearranging sums and doing some further mathematical footwork we get (22) i: P{Sk-r<t<Sk-r+i) £ I E{e-'^'\S,_,<t<Sk-r+x)-hj^ ^ |f^ s'^^ (jz \ E{e-'^Sk=x)^ dx =1; {f^, e-^' £ I j; ^(,4^, .-»'^.-(i-.)^-.. HUU ki\{i^k-r)\{2-\)\ '^nUrHi H!(i-l)!/3'r! 
' 4 1 [C-f J (-g ^ «-'■) -g (4^)'" (¥ — i «) (t)'"' ^' -")] THE QUEUE G/M/m/N HY Let us define the functions Jo ri (23) <p=l3/{s+0) For s=0 define hk as the limiting value r=0 «^ Here a^ is defined as in section 6, except that m is replaced b}^ (3 and u=l, p = 0. From (21) and (22) we get the linear system *+i {—I (24) 9N.k=^~-i—(ik-i+igN.i+hK, Jc=l, . . ., N, 1=2 K with the agreement gN.N+\=9N.N- This system can be treated in much the same way as the system (2) was treated in section 6 for the G/M/l/N queue. Let Rff be the NxN determinant of the system (24) with element Oki T;— dk- i+1 (define a,=0 for r<0) in row k and column i for {k, i) 9^ (N, N) and element . A^-l N di—ao in the right bottom corner. Also define 7^ with element 5*i JT' O-k-i+l- i-\ k Then Expanding Ty along the last row gives the recursive formula [^0=1 (25) L, ^ ^N-i T^=T^-,-^-j;fa,ai-'T^-u N=l.2. Let further Hj^^ be obtained from Rff by replacing the Arth column with [hi, . . ., h^f]^; then by Cramer's rule gN.k=Hj^.k/RN, k=l, . . , N. Let t/;y 4 be obtained from 7^ in the same way, then 1][8 S. I. ROSENLUND with the agreement f7jv_i.Ar=0. Expanding also U^.k along its last row gives (26) i ^^.*=f^^-i.*-g ^^AT-.aS'-'-'C/i.^+^i.ao^-^^-t-i + '^ir(iN-i<io-''U,-x.i+i, N=\, 2, . . .; ^=1, . . .,N. i=0 -iV To avoid unnecessary powers of a^ in the formulas we introduce tn=(h "Tn and Un.k^ao "£/„.i. We can then represent the solution of (24) via the recursive algorithm (a, and h^ given by (23)) ^^^^ tn=ao-' r«„-i-S - a,-iti]' n=l, 2, . . . I L i=0 «- J (28) u„.k=ao~^ u„-i,k—^ - an-iUi^k+hJk-i +X) - G^»-if^t-i. i+i ' n=l, 2, . . .; A:=l, . . .,n L i=kn i=0 'i J (29) gN.k = iU!,.k-U:,-i,,)/{t:,-t^-i), N=l, 2, . . .; ^-1, . . ., A^. It is possible to obtain an explicit contour integral expression for tn- (It does not seem worth- while to pursue this line for w„,t.) Define t{2) = j:tn2\ n=0 Then from (27) n=l re = l ! = l ! 
= 1 n=i This differential equation has, for t{0) = l, the solution «(3)=expfp^^ ^ \, 0<z<g{s,l;l3), and so we can represent tn in the form ^^^^ ^"^ini^ '~"~'"^P C^^ — r^' 0<r<^(.,l;/3). For iV=l the solution (29) is trivially ^,_i=^=/3(s+/3)-'. For N=2 the solution is obtained via the expressions ti=ao~^ t2=ao-Hl-aj2) Ui.i=(paQ~^{l—ao) W2.i=v'ao~'[(l-ao)(l+ao(l+<p)/2)-ai/2] U2.2=<pao-mi-ao) {l+<p)-ai]/2. THE QUEUE G/M/m/N 119 Further, for A''=2 we have from (20) and section 9 where 5,=ao(0)-Hl-ao(0)) ff2=ao(0)-2(l-ao(0)-a,(0)). For larger values of A'^ the solution increases rapidly in complexity. REFERENCES [1] Bhat, U. N., "The Queue GI/M/2 with Service Rate Depending on the Number of Busy Servers," Annals of the Institute of Statistical Mathematics 18. 211-221 (1966). [2] Bhat, U. N., "Some Problems in Finite Queues," in Mathematical Methods in Queuing Theory, pp. 139-156. Lecture Notes in Economics and Mathematical Systems 98. (Springer-Verlag, Berlin 1974 (1973)). [3] Cohen, J. W., The Single Server Queue. (North-Holland, Amsterdam (1969)). [4] De Smit, J. H. A., "On the Many Server Queue with Exponential Service Times," Advances Applied ProbabiUty 5, 170-182 (1973). [5] Finch, P. D., "Balking in the Queuing System GI/M/1," Acta Matematica Academiae Scientiarum Hungaricae 10, 241-247 (1959). [6] Palm, C, "Intensitatsschwankungen in Fernsprechverkehr," Ericsson Technics 44^ 1-189 (1943). [7] Rosenlund, S. I., "On the M/M/m Queue with Finite Waiting Room," Bulletin de la Soci6t6 Royale des Sciences de Liege 44, 42-55 (1975). [8] Rosenlund, S. L, "Busy Period of a Finite Queue with Phase Type Service," Journal of Applied Probability 12, 201-204 (1975). [9] Takacs, L., "On the Generalization of Erlang's Formula," Acta Mathematica Academaie Scientiarum Hungaricae 7, 419-433 (1956). 
[10] Takacs, L., "On a Combined Waiting Time and Loss Problem Concerning Telephone Traffic," Annals Universitatis Scientiarum Budapestinensis de Rolando Eotvos, Sectio Mathematica, 1, 73-82 (1958).
[11] Tomko, J., "A Limit Theorem for a Queue When the Input Rate Increases Indefinitely," (in Russian) Studia Scientiarum Mathematicarum Hungarica 2, 447-454 (1967).

WEIBULL TOLERANCE INTERVALS ASSOCIATED WITH MODERATE TO SMALL SURVIVAL PROPORTIONS FOR USE IN A NEW FORMULATION OF LANCHESTER COMBAT THEORY*

Nancy R. Mann
Science Center, Rockwell International
Thousand Oaks, California

ABSTRACT

Given herein is an easily implemented method for obtaining, from complete or censored data, approximate tolerance intervals associated with the upper tail of a Weibull distribution. These approximate intervals are based on point estimators that make essentially most efficient use of sample data. They agree extremely well with exact intervals (obtained by Monte Carlo simulation procedures) for sample sizes of about 10 or larger when specified survival proportions are sufficiently small. Ranges over which the error in the approximation is within 2 percent are determined. The motivation for investigation of the methodology for obtaining the approximate tolerance intervals was provided by the new formulation of Lanchester Combat Theory by Grubbs and Shuford [3], which suggests a Weibull assumption for time-to-incapacitation of key targets. With the procedures investigated herein, one can use (censored) data from battle simulations to obtain confidence intervals on battle times associated with given low survivor proportions of key targets belonging to either specified side in a future battle. It is also possible to calculate confidence intervals on a survival proportion of key targets corresponding to a given battle duration time.

I.
INTRODUCTION

In the more usual analyses of life data, the parameters typically of greatest interest are distribution percentiles in the lower tail of the associated life distributions. These percentiles, in reliability analysis, are values of reliable life corresponding to specified high survival probabilities. In contrast, if a probability distribution is assumed for time-to-incapacitation of key targets as in the new formulation of Lanchester Combat Theory by Grubbs and Shuford [3], then it is very often the upper-tail percentiles of the distribution that are of primary concern. There are, doubtless, other life-time distributions in which times associated with low survival probabilities are often of concern. For example, this might be true of distributions of particular types of medical data or distributions of data collected to demonstrate longevity of a material or piece of hardware.

* This research was supported by the Army and the Navy through the Office of Naval Research under Contract N00014-73-C-0474 (NR042-321).

Here we assume a two-parameter Weibull model for time-to-failure or time-to-incapacitation, as suggested by Grubbs and Shuford [3], and we consider the problem of obtaining confidence intervals for distribution percentiles associated with the upper tail of the distribution, or specified moderate to low survival proportions. Alternatively a "mission time" might be specified and a confidence interval desired for a corresponding survival proportion. For Lanchester Combat Theory, such confidence intervals could be obtained from data generated during computer simulation of a sample engagement that is, as Grubbs and Shuford [3] remark, . . . representative of the hypothesized general characteristics of many battles in the supposed environment.
In particular, for example, we may be interested in running a sample simulation of a combat situation in order to see whether or not it is likely that our choice of weapons, the tactics employed in using them, and certain command-and-control principles would overwhelm and defeat an enemy with somewhat different weapons capabilities in the same hypothesized battle environment.

If a Weibull model is assumed for time-to-incapacitation, or time-to-neutralize, for key targets of each of the two opposing sides in a fixed combat situation, then the simulated "observed" times of incapacitation of targets on each of the two sides can be used to estimate the two sets of distribution parameters. If a survival proportion is specified for either side, then a time corresponding to this proportion can be estimated as a function of the parameter estimates. If, alternatively, a "mission time," or "battle duration time," is specified, corresponding survival proportions can be estimated for both sides from the two sets of parameter estimates.

In the following, an approximation that can be used in conjunction with the parameter estimates for obtaining confidence intervals on survival proportions corresponding to specified battle duration times or on duration times corresponding to specified survival probabilities is described.

II. A WEIBULL MODEL FOR TIME-TO-INCAPACITATION

It is assumed here that the random variate time-to-incapacitation $T$ is such that

(1) $P(T \le t) = 1-\exp[-(t/\delta)^{\beta}]$ for $t \ge 0$, and $P(T \le t) = 0$ otherwise,

where $\delta, \beta > 0$. For $X = \ln T$, $\eta = \ln \delta$, and $\xi = \beta^{-1} > 0$, this is equivalent to

(2) $P(X \le x) = 1-\exp\{-\exp[(x-\eta)/\xi]\}.$

The random variate $T$ has a Weibull distribution with shape parameter $\beta$ and scale parameter (characteristic time-to-incapacitation) $\delta$. (The parameters $\beta$ and $\delta$ will of course tend to be different for the two opposing sides in a battle.) The random variate $X$ has the first asymptotic distribution of the smallest extreme (the extreme-value distribution).
The parameter $\eta$ is the mode and $\pi\xi/\sqrt{6}$ is the standard deviation of the distribution of $X$. The linear estimates discussed in the following are based on observations on $X$ ordered from smallest to largest.

Grubbs and Shuford [3] reference three papers containing tabulations from which one can calculate confidence intervals, for each of the two opposing sides, on the distribution parameters, on high survival probabilities of key targets, or on lower-tail distribution percentiles. These are based on (iteratively obtained) maximum-likelihood estimates, best linear invariant (BLI) estimates, or approximations of BLI estimates. One of these three papers (Billman, Antle, and Bain [1]) applies only to censored samples. The tables given are extensions of those published by Thoman, Bain, and Antle [17] applying to maximum-likelihood estimates obtained from complete samples. The paper referenced in Grubbs and Shuford with tables for obtaining Weibull confidence intervals from best linear invariant estimates of Mann [7, 8] is by Mann and Fertig [11]. The tabulations given therein apply to samples of sizes 2 through 25, either complete or with all possible right-hand censorings. Tabulations of Johns and Lieberman [4] pertain to asymptotic approximations of best linear invariant estimates and four right-hand censorings for each of six sample sizes ranging from 10 to 100.

These published tabulations have all been generated by means of Monte Carlo simulation techniques, since the exact distributions of the estimators cannot be determined. The estimators to which they apply are all asymptotically fully efficient; that is, as sample size $n$ approaches infinity the variances of the (asymptotically unbiased) estimators approach the Cramer-Rao lower bounds for variances of regular unbiased estimators.
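The equivalence of (1) and (2) in Section II can be checked numerically. The following is a minimal sketch in Python, not part of the original paper; the function names and the parameter values (scale 30, shape 1.5) are hypothetical and chosen only for illustration.

```python
import math


def weibull_cdf(t, delta, beta):
    """Equation (1): P(T <= t) for scale delta > 0 and shape beta > 0."""
    if t < 0:
        return 0.0
    return 1.0 - math.exp(-((t / delta) ** beta))


def extreme_value_cdf(x, eta, xi):
    """Equation (2): P(X <= x) for X = ln T, with eta = ln(delta), xi = 1/beta."""
    return 1.0 - math.exp(-math.exp((x - eta) / xi))


# Hypothetical parameter values, for illustration only.
delta, beta = 30.0, 1.5
eta, xi = math.log(delta), 1.0 / beta

for t in (5.0, 30.0, 80.0):
    p_weibull = weibull_cdf(t, delta, beta)
    p_extreme = extreme_value_cdf(math.log(t), eta, xi)
    # The two forms describe the same event, so they agree up to rounding.
    assert abs(p_weibull - p_extreme) < 1e-12
    print(f"t = {t:5.1f}   proportion incapacitated = {p_weibull:.4f}")

# Standard deviation of X = ln T, as stated in the text: pi * xi / sqrt(6).
sd_x = math.pi * xi / math.sqrt(6.0)
print(f"standard deviation of X = {sd_x:.4f}")
```

At $t = \delta$ the proportion incapacitated is $1 - e^{-1}$ regardless of the shape parameter, which is why $\delta$ is called the characteristic time-to-incapacitation.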
The parameter of primary interest in the present context is $t_P = \exp(x_P)$, the 100$P$th percentile of the distribution, where $P$ is the proportion of key targets incapacitated by time $t_P$. The proportion $P$ incapacitated will, of course, tend to be different for each of the two opposing sides for fixed $t_P$. Note, from (2), that $x_P = \eta + \xi \ln[-\ln(1-P)]$. If $P$ is specified, then essentially optimum point estimates of $t_P$ in terms of mean squared error in the log space can be obtained for targets on each of the two sides as $\exp\{\tilde\eta + \tilde\xi \ln[-\ln(1-P)]\}$ or $\exp\{\hat\eta + \hat\xi \ln[-\ln(1-P)]\}$, where, for each side, $\tilde\eta$ and $\tilde\xi$ are best linear invariant estimates and $\hat\eta$ and $\hat\xi$ are maximum-likelihood estimates of $\eta$ and $\xi$, respectively, and $\ln[-\ln(1-P)]$ is the 100$P$th percentile of $(X-\eta)/\xi$. If a battle duration time $t_0 = \exp(x_0)$ is specified, then $\exp\{-\exp[(x_0-\tilde\eta)/\tilde\xi]\}$ and $\exp\{-\exp[(x_0-\hat\eta)/\hat\xi]\}$ provide efficient point estimates of $1-P(t_0)$, the proportion of targets on each side surviving incapacitation until at least time $t_0$.

As noted, the tabulations mentioned earlier apply only to the calculation of confidence bounds on lower-tail distribution percentiles or on high survival probabilities. Because of this and because these tabulations of exact values were necessarily generated by Monte Carlo simulation procedures, one would expect that similar simulation procedures would be necessary in order to find values to be used in calculating confidence bounds on upper-tail distribution percentiles or low survival proportions. We show now that such is not the case.

III. APPROXIMATIONS FOR CONFIDENCE INTERVALS ON BATTLE DURATION TIME

Recently, Engelhardt and Bain [2] and Mann and Fertig [12] have shown that efficient linear estimators of $\xi$ (those with smallest or nearly smallest mean squared error) are approximately proportional to chi-square variates.
Thus, using the two-moment fit of Patnaik [15], one sees that if $\xi^*_{r,n}$ is an unbiased efficient linear estimator of $\xi$, based on the $r$ smallest observations from a sample of size $n$, with variance $C_{r,n}\xi^2$, then $2\xi^*_{r,n}/(C_{r,n}\xi)$ is approximately a chi-square variate with $2/C_{r,n}$ degrees of freedom. If $\xi^*_{r,n}$ is the unique best (uniformly minimum variance) linear unbiased estimator of $\xi$, then $\tilde\xi_{r,n} = \xi^*_{r,n}/(1+C_{r,n})$ is the best linear invariant estimator of $\xi$. (The estimator $\tilde\xi_{r,n}$ has a mean squared error independent of $\eta$ and a uniformly smaller mean squared error than that of $\xi^*_{r,n}$.) As can be seen from the results of Lawless and Mann [6], $\tilde\xi_{r,n}$ is very nearly equivalent to the iteratively obtained maximum-likelihood estimator $\hat\xi_{r,n}$ of $\xi$. This near equivalence can be seen, too, if one compares tabulations of distribution percentiles of $\tilde\xi_{r,n}/\xi$ and of $\xi/\hat\xi_{r,n}$ appearing in Mann and Fertig [11] and Thoman, Bain, and Antle [16], respectively, keeping in mind that both sets of Monte Carlo-generated tabulations are correct to within about a unit in the second decimal place for the values of $n$ compared. Hence $2(1+C_{r,n})\tilde\xi_{r,n}/(C_{r,n}\xi)$ and $2(1+C_{r,n})\hat\xi_{r,n}/(C_{r,n}\xi)$ are both approximate chi-square variates with $2/C_{r,n}$ degrees of freedom.

This fact allows one to calculate confidence bounds for $\xi$ by using either iteratively obtained maximum-likelihood estimates or best linear unbiased or best linear invariant estimates (or efficient approximations thereof) in conjunction with values of $C_{r,n}$ obtainable from tabulations in, for example, Mann [7, 8], Engelhardt and Bain [2], Mann, Schafer, and Singpurwalla [14], and Mann and Fertig [12]. Engelhardt and Bain also provide, for various values of $r/n$, approximate expressions for $C_{r,n}$ that are quadratics in $1/n$.

Mann, Schafer, and Singpurwalla [14] use the chi-square property of $2\xi^*_{r,n}/(C_{r,n}\xi)$ to construct an approximately $F$-distributed statistic for obtaining confidence intervals on $t_P$ (reliable life, or in this context, battle duration time).
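The two-moment fit used above can be made concrete with a short sketch. It is written in Python as an illustration only; the function name and the numerical values of $\xi$ and $C_{r,n}$ are hypothetical and are not taken from the paper or its tabulations. The fit matches the mean and variance of a positive variate to those of a scaled chi-square: if $X$ has mean $m$ and variance $v$, then $2mX/v$ is treated as chi-square with $2m^2/v$ degrees of freedom.

```python
def two_moment_chi2_fit(mean, variance):
    """Two-moment chi-square fit for a positive random variate.

    Returns (scale, dof) such that X / scale is treated as approximately
    chi-square with `dof` degrees of freedom. Matching moments gives
    scale * dof = mean and 2 * scale**2 * dof = variance.
    """
    if mean <= 0 or variance <= 0:
        raise ValueError("mean and variance must be positive")
    scale = variance / (2.0 * mean)
    dof = 2.0 * mean ** 2 / variance
    return scale, dof


# Hypothetical values, for illustration only.
xi = 0.8    # true value of the parameter xi
c_rn = 0.125  # variance coefficient C_{r,n} of the unbiased estimator xi*

# xi* is unbiased for xi with variance C_{r,n} * xi**2.
scale, dof = two_moment_chi2_fit(mean=xi, variance=c_rn * xi ** 2)

# scale equals C_{r,n} * xi / 2, so xi* / scale = 2 xi* / (C_{r,n} xi),
# and the degrees of freedom equal 2 / C_{r,n}, as stated in the text.
print(f"scale = {scale:.4f}, degrees of freedom = {dof:.2f}")
```

Note that the degrees of freedom depend only on $C_{r,n}$ and not on $\xi$, which is what makes the approximation usable for confidence bounds on $\xi$.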
First, consider best (or approximately best) linear unbiased estimators $\eta^*_{r,n}$ and $\xi^*_{r,n}$ of $\eta$ and $\xi$, respectively, with variances $A_{r,n}\xi^2$ and $C_{r,n}\xi^2$ and covariance $B_{r,n}\xi^2$. Best (or approximately best) linear invariant estimators of $\eta$ and $\xi$ are then

$\tilde\eta_{r,n} = \eta^*_{r,n} - \xi^*_{r,n}B_{r,n}/(1+C_{r,n})$ and $\tilde\xi_{r,n} = \xi^*_{r,n}/(1+C_{r,n}).$

Define $X^*_{r,n}$ as $[\eta^*_{r,n} - (B_{r,n}/C_{r,n})\xi^*_{r,n}] = \tilde\eta_{r,n} - (B_{r,n}/C_{r,n})\tilde\xi_{r,n}$, with variance $\xi^2(A_{r,n} - B^2_{r,n}/C_{r,n})$. The covariance of $X^*_{r,n}$ and $\xi^*_{r,n}$ is given by $[B_{r,n} - (B_{r,n}/C_{r,n})C_{r,n}]\xi^2 = 0$. Mann, Schafer, and Singpurwalla [14] give examples of comparisons with exact, Monte Carlo-generated, previously tabulated values to show that for small values of $P$ one can consider

(3) $(X^*_{r,n} - x_P)/\{\{-B_{r,n}/C_{r,n} - \ln[-\ln(1-P)]\}\,\xi^*_{r,n}\}$

to have an $F$ distribution with

(4) $\nu_1 = 2\{B_{r,n}/C_{r,n} + \ln[-\ln(1-P)]\}^2/(A_{r,n} - B^2_{r,n}/C_{r,n})$

and

(5) $\nu_2 = 2/C_{r,n}$

degrees of freedom. Note that both $\xi^*_{r,n}$ and $(X^*_{r,n} - x_P)/\{-B_{r,n}/C_{r,n} - \ln[-\ln(1-P)]\}$ are unbiased estimators of $\xi$; with a two-moment fit to chi-square, each is, under the proper conditions, an approximate chi-square over its degrees of freedom. The value of $\nu_1$ is, like the value of $\nu_2$, obtained from the two-moment chi-square fit ($2m(X^*-x_P)/v$ is approximately chi-square with $2m^2/v$ degrees of freedom, for $E(X^*-x_P) = m$ and $\mathrm{var}(X^*-x_P) = v$) and can be calculated from tabulated values. Values of $A_{r,n}$ and $B_{r,n}$ can be found with tabulations of $C_{r,n}$. Additional values of $A_{r,n}$, for large $n$, are given by Mann, et al. ([14], p. 252).

Mann [10] gives the rationale for the chi-square fit for the numerator and gives ranges in terms of $\nu_1$ and $\nu_2$ for which values for obtaining confidence bounds on $x_P$ from the $F$-approximation (3) are in error by one percent or less for $P = 0.01$, 0.05, and 0.10. In most cases the approximation is excellent, even for $n$ or $r$ as small as four unless censoring is extreme. Lawless [5] considers lower confidence bounds on $x_P$ (for $P = 0.01$, 0.05, and 0.10) based on (3) with $X^*_{r,n}$ replaced by $\hat\eta_{r,n} - (B_{r,n}/C_{r,n})\hat\xi_{r,n}$,
a function of the maximum-likelihood estimators $\hat{\eta}_{r,n}$ and $\hat{\xi}_{r,n}$, and $\xi^*_{r,n}$ replaced by $(1 + C_{r,n})\hat{\xi}_{r,n}$. He shows that these agree to two or more significant figures with exact values for sample sizes 25 through 60 unless nearly 90 percent of the sample is censored; i.e., $r/n \approx 0.10$. Mann [9] demonstrates the excellence of a similar F-approximation in obtaining prediction intervals for future samples, or lots to be manufactured in the future. Thus, for a specified small to moderate survival proportion $1-P$ applying to either of the opposing sides in a battle, one might conclude from (3) that an approximate lower $(1-\alpha)$-level confidence bound on the corresponding battle duration time $t_P = \exp(x_P)$ can be calculated as
(6) $\exp\{X^*_{r,n} + F_{\alpha}(\nu_1, \nu_2)\{B_{r,n}/C_{r,n} + \ln[-\ln(1-P)]\}\xi^*_{r,n}\}$,
where $F_{\alpha}(\nu_1, \nu_2)$ is the $100\alpha$th percentile of F with $\nu_1$ and $\nu_2$ degrees of freedom, and $\nu_1$ and $\nu_2$ are defined by (4) and (5), respectively. An upper $(1-\alpha)$-level confidence bound on $t_P$ is given by
(7) $\exp\{X^*_{r,n} + F_{1-\alpha}(\nu_1, \nu_2)\{B_{r,n}/C_{r,n} + \ln[-\ln(1-P)]\}\xi^*_{r,n}\}$.
A two-sided confidence interval on $t_P$ at confidence level $(1-2\alpha)$ is an interval with lower and upper bounds given by (6) and (7), respectively. It is important to note that for the cases studied in [5] and [10], with $P$ such that $E[(X^* - \eta)/\xi] > \ln[-\ln(1-P)]$, $F_{\alpha}(\nu_1, \nu_2)$ is replaced by $F_{1-\alpha}(\nu_1, \nu_2)$ in (6) and $F_{1-\alpha}(\nu_1, \nu_2)$ by $F_{\alpha}(\nu_1, \nu_2)$ in (7). Recall, for use of F tables, that $F_{\alpha}(\nu_1, \nu_2) = 1/F_{1-\alpha}(\nu_2, \nu_1)$. In general, however, $\nu_1$ and $\nu_2$ will not be integers, so that one can interpolate in tables of percentiles of F, or can use the approximation suggested by Mann and Grubbs [13] to evaluate $F_{\gamma}(\nu_1, \nu_2)$ for given $\gamma$. It should be clear from inspection of expressions (3), (4), and (5) that if $x_P$ is specified and a confidence interval is required for $P$ (or equivalently for $\ln[-\ln(1-P)]$), it must be computed iteratively since $P$ occurs in both (3) and (4).
In this case we find that a lower confidence bound for $P$ at level $1-\alpha$ for specified $x_P = \ln(t_P)$ is given by
$1 - \exp\{-\exp\{(x_P - X^*_{r,n})/[\xi^*_{r,n}F_{1-\alpha}(\nu_1, \nu_2)] - B_{r,n}/C_{r,n}\}\}$,
where $\nu_1$ is a function of the value of $P$ which is being determined.

IV. PRECISION OF THE APPROXIMATION

A Monte Carlo simulation study was undertaken so that the precision of the F-approximation (3) could be determined for values of $P$ of interest. The values of $P$ considered were 0.75 (0.05) 0.95 and 0.99, and the percentiles of the appropriate F-distributions tabulated and compared with Monte Carlo values were 0.01, 0.02, 0.05 (0.05) 0.95, 0.98, 0.99. The values of $n$ and $r$ considered were $n = 8$, $r = 4, 8$ and $n = 15$, $r = 5, 10, 15$. For $r/n$ fixed, one would expect precision to increase with increasing sample size $n$ since the basis of the approximation is an asymptotic result. Monte Carlo sample size for the study was 10,000. It was found that the precision of the approximation increased with increasing $P$ over the range considered and with a limited amount of increased censoring for a fixed sample size. In both cases the increase in precision resulted apparently from an increase in $\nu_1$ relative to $\nu_2$. Results indicate that agreement of exact percentiles of (3) and those based on the F-approximation is to within about two percent over the range of percentiles from 0.01 to 0.99 for $\nu_1 \geq 0.3\nu_2 + 20.0$, $\nu_2 \geq 8.0$, and over the range of percentiles from 0.10 to 0.99 for $\nu_1 \geq 0.3\nu_2 + 4.0$, $\nu_2 \geq 8.0$. From simulations of Mann [10], applicable when $E[(X^* - \eta)/\xi] > \ln[-\ln(1-P)]$, it can be seen that the F-approximation is correct to within approximately two percent for $P$ about 0.5 or less, when $\nu_1 > 0.8\nu_2$. Shown in Table I are the results of 6 of the 30 independent simulations performed.

Table I. Approximate and Monte Carlo (M.C.) Values of $100\gamma$th Distribution Percentiles of the Approximate F-Variate (3).
„=15, r=] 10, P=0.85 ,1=15, r= 5, P=0.85 Ji= 15, r = 5, P=0.75 ^1=15.0 .-2=22.4 ^, = 23.2 .'2 = 8.8 .-,= 19.1 V2 = 8.8 7 M.C. Approximate M.C. Approximate M.C. Approximate pcicentilc percentile percentile percentile percentile percentile 0.01 0.223 0.310 0. 315 0. 319 0.268 0.283 0.02 0. 302 0. 358 0.360 0. 3C4 0.318 0.329 0. 05 0.409 0.442 0. 445 0. 446 0.406 0. 412 0. 10 0.518 0.530 0. 535 0.535 0.491 0.504 0.25 0. 718 0. 713 0.728 0.732 0. 697 0.707 0. 40 0. 88G 0.873 0. 921 0.917 0.891 0.899 0. 50 1.00 0. 984 1.00 1.05 1. 04 1.04 0. GO 1. 13 1. 11 1.22 1. 22 1. 21 1.21 0.75 1.37 1.35 1. 58 1. 50 1.55 1.57 0. 90 1.81 1. 81 2.34 2.29 2.30 2.33 0.95 2. 10 2. 15 3.00 2. 93 2.96 3. 10 0.98 2. 65 2.02 4.08 3.97 4. 18 4. 10 0. 99 3. 04 3. 00 5.08 4. 92 5. 14 5. 10 n=15, r=] 15, P=0.99 ,i = 8, r = 4, P=0.95 ,i = 8, r = 8, P=0.95 vi = 40.4 V2 = 44.1 vi = 23.0 .-2 = 7.0 .-1 = 9.9 V2 = 21.5 y M.C. Approximate M.C. Approximate M.C. Approximate percentile percentile percentile percentile percentile percentile 0. 01 0.472 0. 482 0. 284 0. 282 0. 059 0.233 0.02 0. 510 0. 525 0. 322 0.327 0. 147 0. 278 0.05 0. 588 0. 598 0.407 0. 410 0. 297 0. 303 0. 10 0. 005 0. 070 0. 498 0. 502 0. 414 0. 454 0. 25 0.808 0. 809 0. 700 0. 711 0. 052 0. 052 0. 40 0. 922 0. 922 0. 914 0. 915 0. 849 0. 831 0. 50 0. 990 0. 998 1.07 1. 07 0. 971 0.958 0.00 1.07 1.08 1.20 1. 20 1. 12 1. 10 0. 75 1. 23 1.23 1.71 1.07 1.40 1.40 0. 90 1. 49 1. 49 2. ()3 2. 59 1. 95 1. 93 0. 95 1. 08 1. 07 3. 55 3. 40 2.37 2.30 0. 98 1.93 1. 90 5. 43 4. 90 2. 99 2. 95 0. 99 2. 10 2. 07 7.20 0. 44 3. 01 3. 42 WEIBULL TOLERANCE INTERVALS FOR COMBAT THEORY 127 V. EXAMPLE An example of simulated-battle data is given in Table 1 of Grubbs and Shuford [3], and a description of the simulated battle is given preceding their table. For each of the two sides in the engagement, the number of key targets (tanks, in this case) is 20. 
The data consist of times-to- incapacitation in minutes for four CBT's (chief battle tanks) and for five RlO tanks on the opposing side. Extensions of tables of Mann [8] have been used to provide estimates 574.20= 5,827 and ^4. 20= 1-002 from the RlO data. To use (7) to obtain a confidenceintervalon^pfor the RlO tanks for a specified P of, sa}^, 0.95, one needs to calculate In [—In (1—P)] = 1.0972 and to look up tabulated values from which values of ^5 20, Bs^zo, a.nd C5.20 can be determined in the report that provides the coefficients (or weights) for calculating the estimates of r; and ^. The tabulated values from which the necessary constants can be calculated are E,,,^ {LV)=E,,,, [{v-nYVe, £"5,20 {LB)=E,,2o [(?-^)']/^^ and £'5,20 (CP) =E[(ri — ri) (^— ^)]/$^ To determine ^5,20. -S5.20, and 6^5,20 one needs to know the relationships: a,,o=E,,2o(LB)/[l-E,,,o{LB)] B,,2o=E,,,o{CP)/[l-E,,,o{LB)] and As.2o=E,.,oiLU) + [E,.2oiCP)f/[l-E,,,o{LB)]. From the tables and these relationships, we find ^5.20=0.70308, ^5.20=0.33548, and (75,20=0.23662. (For n=25(5)60, ^'=^'^=0.10(0.10)1.00, values oi Ar.„, Br.n, and C, „ (or /,. „) are tabulated directly by Mann, Schafer, and Singpurwalla [14], p. 252 and p. 244.) Next, from (4) and (5), we find for the RlO data, »'i = 2(1.4178+ 1.0972)7(0.2274) = 55.62 and i'2 = 2/0.23662 = 8.45, correct to two decimal places. Note that j'2>8.0 and ^i>0.3(i'2) +20 = 22.53, so we might expect the i^-approximation to give very nearly the correct percentile of the distribution of (3) over the percentile range of 0.01 through 0.99. Using the method of Mann and Grubbs [13] to evaluate, say, the 10th percentile of P(55.62, 8.45), we obtain 0.567. Then a 90 percent lower confidence bound on U. 95 is, from (6) , given by exp [5.040- 1.4178(1.002) +0.567(1.4178 + 1.0972) (1.23662) (1.002)1 = 218 minutes. 
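The arithmetic of this example can be retraced with a short script. The following is only a sketch under stated assumptions: the function name and argument names are ours, the F percentile is passed in as a number taken from the text (0.567) rather than computed, and the location and scale estimates are the values 5.040 and 1.002 that appear inside the worked expression above.

```python
import math

def f_approximation_lower_bound(eta_tilde, xi_tilde, A, B, C, P, f_percentile):
    """Evaluate (4), (5) and the lower bound (6) for given constants.

    eta_tilde, xi_tilde : best linear invariant estimates of eta and xi
    A, B, C             : tabulated variance/covariance constants A_{r,n}, B_{r,n}, C_{r,n}
    f_percentile        : F_alpha(nu1, nu2), supplied by the caller (assumption:
                          looked up or approximated elsewhere, e.g. as in [13])
    """
    k = math.log(-math.log(1.0 - P))            # ln[-ln(1 - P)]
    ratio = B / C                               # B_{r,n} / C_{r,n}
    nu1 = 2.0 * (ratio + k) ** 2 / (A - B * B / C)   # expression (4)
    nu2 = 2.0 / C                                    # expression (5)
    x_star = eta_tilde - ratio * xi_tilde       # X* written with invariant estimates
    xi_star = (1.0 + C) * xi_tilde              # unbiased estimate recovered from xi_tilde
    bound = math.exp(x_star + f_percentile * (ratio + k) * xi_star)  # expression (6)
    return nu1, nu2, bound

# Constants quoted in the example for the R10 data (r = 5, n = 20).
nu1, nu2, bound = f_approximation_lower_bound(
    eta_tilde=5.040, xi_tilde=1.002,
    A=0.70308, B=0.33548, C=0.23662,
    P=0.95, f_percentile=0.567,
)
print(round(nu1, 2), round(nu2, 2), round(bound))
```

Passing the percentile in explicitly keeps the sketch free of any particular F-distribution routine; any table interpolation or approximation can be substituted by the reader.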
If a lower confidence bound on $\delta$ is desired, as on page 938 of Grubbs and Shuford [3], then the specified value of $P$ is $1 - \exp(-1) = 0.63$ and $\ln[-\ln(1-P)] = 0$, so that $\nu_1 = 2(1.4178)^2/(0.2274) = 17.68$ and $\nu_2 = 8.45$, as before. The value of $F_{0.05}(17.68, 8.45)$ is approximately 0.408, so that a 95 percent lower confidence bound on $\delta = \exp(\eta)$ is given by
exp[5.040 - 1.4178(1.002) + 0.408(1.4178)(1.23662)(1.002)] = 76.4 minutes.
This agrees well with the lower 95 percent confidence bound of 74.4 minutes shown in the example of Grubbs and Shuford. The latter was calculated as $\exp(\tilde{\eta} - W_{0.95}\tilde{\xi})$, where $W_{0.95}$ is a percentile of the distribution of $(\tilde{\eta} - \eta)/\tilde{\xi}$ computed by Monte Carlo simulation procedures and published in the report by Mann, Fertig, and Scheuer, with tabulations supplemental to those of Mann and Fertig [11]. A lower confidence bound on $\delta$, incidentally, will correspond to a lower confidence bound on mean-time-to-incapacitation for $\beta = \xi = 1$, since the exponential distribution is a special case of the Weibull with shape parameter equal to one.

REFERENCES

[1] Billman, B. R., C. L. Antle, and L. J. Bain, "Statistical Inference from Censored Weibull Samples," Technometrics 14, 831-840 (1971).
[2] Engelhardt, M., and L. J. Bain, "Some Complete and Censored Results for the Weibull or Extreme-Value Distribution," Technometrics 15, 541-549 (1973).
[3] Grubbs, F. E., and J. H. Shuford, "A New Formulation of Lanchester Combat Theory," Operations Research 21, 926-941 (1973).
[4] Johns, M. V., Jr., and G. J. Lieberman, "An Exact Asymptotically Efficient Confidence Bound for Reliability in the Case of the Weibull Distribution," Technometrics 8, 135-175 (1966).
[5] Lawless, J. F., "Construction of Tolerance Bounds for the Extreme-Value and Weibull Distributions," Technometrics 17, 255-261 (1975).
[6] Lawless, J. F., and N. R. Mann, "Tests for Homogeneity of Extreme-Value Scale Parameters," Communications in Statistics 5, 389-405 (1976).
[7] Mann, N. R., "Results on Location and Scale Parameter Estimation with Application to the Extreme-Value Distribution," Aerospace Research Laboratories Report ARL 67-0023, Office of Aerospace Research, U.S. Air Force, Wright-Patterson Air Force Base, Ohio (1967).
[8] Mann, N. R., "Tables for Obtaining the Best Linear Invariant Estimates of the Parameters of the Weibull Distribution," Technometrics 9, 629-645 (1967).
[9] Mann, N. R., "Warranty Periods for Production Lots Based on Fatigue-Test Data," Engineering Fracture Mechanics 8, 123-130 (1975).
[10] Mann, N. R., "An F Approximation for Two-Parameter Weibull and Lognormal Tolerance Bounds Based on Possibly Censored Data," Naval Research Logistics Quarterly 24, 187-196 (1977).
[11] Mann, N. R., and K. W. Fertig, "Tables for Obtaining Weibull Confidence Bounds and Tolerance Bounds Based on Best Linear Invariant Estimates of the Extreme-Value Distribution," Technometrics 15, 87-101 (1973).
[12] Mann, N. R., and K. W. Fertig, "Simplified Efficient Point and Interval Estimators for Weibull Parameters," Technometrics 17, 361-378 (1975).
[13] Mann, N. R., and F. E. Grubbs, "Chi-Square Approximations for Exponential Parameters, Prediction Intervals and Beta Percentiles," Journal of the American Statistical Association 69, 654-661 (1974).
[14] Mann, N. R., R. E. Schafer, and N. D. Singpurwalla, "Methods for Statistical Analysis of Reliability and Life Data" (John Wiley, New York, 1974).
[15] Patnaik, P. B., "The Non-Central $\chi^2$ and F Distributions and Their Applications," Biometrika 36, 202-232 (1949).
[16] Thoman, D. R., L. J. Bain, and C. E. Antle, "Inferences on the Parameters of the Weibull Distribution," Technometrics 11, 445-460 (1969).
[17] Thoman, D. R., L. J. Bain, and C. E. Antle, "Reliability and Tolerance Limits in the Weibull Distribution," Technometrics 12, 363-371 (1970).

NUMERICAL INVESTIGATIONS ON QUADRATIC ASSIGNMENT PROBLEMS

Rainer E.
Burkard and Karl-Heinz Stratmann
University of Cologne
Federal Republic of Germany

ABSTRACT

This paper contains a comparative study of the numerical behavior of different algorithms for solving quadratic assignment problems. After the formulation of the problem, branch and bound algorithms are briefly discussed. Then, starting procedures are described and compared by means of numerical results. Modifications of branch and bound procedures for obtaining good suboptimal solutions are treated in the next section. Subsequently, numerical results with the Gaschütz-Ahrens algorithm are reported. In the last section, exchange algorithms are discussed, and it is pointed out how they can be combined efficiently with the Gaschütz-Ahrens procedure and the perturbation method. All suboptimal solutions found in the literature could be improved by these combined methods. In the appendix, test examples and the best known solutions are listed.

1. INTRODUCTION

Quadratic assignment problems (QAP) are combinatorial optimization problems of the following form: Find a permutation $\varphi$ of the set $N = \{1, 2, \ldots, n\}$ such that $\sum_{i=1}^{n}\sum_{p=1}^{n} d_{i\varphi(i)p\varphi(p)}$ becomes minimal. Let $S_n$ be the set of all permutations of the set $N$. Then a QAP can be formulated briefly as
(1) $\min_{\varphi \in S_n} \sum_{i=1}^{n}\sum_{p=1}^{n} d_{i\varphi(i)p\varphi(p)}$.
In most applications the cost coefficients $d_{ijpq}$ have a special form:
(2) $d_{ijpq} = a_{ip}b_{jq}$, for $i, j, p, q \in N$.
QAP for which (2) holds are called "Koopmans-Beckmann problems." A Koopmans-Beckmann problem is said to be symmetric if at least one of the matrices $A = (a_{ip})$ or $B = (b_{jq})$ is symmetric. Sometimes it is useful to minimize a bottleneck objective instead of a sum:
(3) $\min_{\varphi \in S_n} \max_{i,p \in N} d_{i\varphi(i)p\varphi(p)}$.
These so-called "quadratic bottleneck problems" (QBP) are numerically easier to solve than problems with a sum objective. Tests with branch and bound methods showed that QAP up to about $n = 15$ and QBP up to about $n = 20$ can be solved optimally in reasonable time ([5], Table 2).
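To make formulations (1)-(3) concrete for the Koopmans-Beckmann case (2), the following minimal sketch evaluates both objectives for a given permutation and finds the optimum of a tiny instance by complete enumeration. The function names and the 3x3 data are hypothetical illustrations, not taken from the paper; indices run from 0 rather than 1, and enumeration is only practical for very small $n$.

```python
from itertools import permutations

def qap_sum(a, b, phi):
    """Sum objective (1) with d_ijpq = a_ip * b_jq, as in (2)."""
    n = len(phi)
    return sum(a[i][p] * b[phi[i]][phi[p]] for i in range(n) for p in range(n))

def qap_bottleneck(a, b, phi):
    """Bottleneck objective (3): largest single term instead of the sum."""
    n = len(phi)
    return max(a[i][p] * b[phi[i]][phi[p]] for i in range(n) for p in range(n))

def brute_force(a, b, objective):
    """Enumerate all n! permutations; feasible only for tiny instances."""
    return min(permutations(range(len(a))), key=lambda phi: objective(a, b, phi))

# Hypothetical symmetric 3x3 instance (illustration only).
a = [[0, 1, 2],
     [1, 0, 3],
     [2, 3, 0]]
b = [[0, 5, 1],
     [5, 0, 2],
     [1, 2, 0]]

best = brute_force(a, b, qap_sum)
print(best, qap_sum(a, b, best), qap_bottleneck(a, b, best))
```

Since the number of permutations grows as $n!$, such enumeration only serves to check small cases; it is not a substitute for the branch and bound methods discussed in the text.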
QAP have many applications in different fields. For example, the following problems have been treated as QAP in recent years:
• backboard wiring problems
• plant location problems
• layout of hospitals [21]
• development of new typewriter keyboards [29, 10].
Most real-world problems lead to dimensions $n > 20$ and can therefore not be solved optimally. Hence, suboptimal algorithms play an important role. Many suboptimal algorithms have been published (see, for example, the surveys in Burkard [7] or Müller-Merbach [24]), but until now these algorithms have hardly been tested numerically. Parker [27] compares some heuristic methods for QAP to obtain acceptable solutions in short time and with small computational expense. This paper, however, emphasizes suboptimal procedures which yield near-optimal solutions in acceptable time. Two methods proved to be favorable: the method of Gaschütz-Ahrens and a modification of the perturbation method, both in connection with an improvement algorithm. All published suboptimal solutions could be improved by these two methods (see appendix). All computer programs which are referred to in this paper were written for nonsymmetric problems, which arise, for example, in typewriter keyboard optimization. The CPU times of computer programs which take advantage of symmetry can be considerably smaller.

2. BRANCH AND BOUND METHODS FOR SOLVING QAP

Several authors (e.g. Gilmore [15], Lawler [22], Conrad [11], Burkard [3, 4], Heider [18]) proposed implicit enumeration algorithms for solving QAP. Pierce and Crowston [28] survey these methods without, however, referring to numerical results. Since the perturbation method yields the best numerical results, we give here a short description of it for Koopmans-Beckmann problems. General QAP and QBP are solved analogously by this method. Let $M \subseteq N = \{1, 2, \ldots, n\}$ and let $\varphi \in S_n$ be a permutation of $N$. We denote the restriction of $\varphi$ to $M$ as partial permutation $(\varphi, M)$.
Moreover, let <I>m'- = {<p ^ Sff\<p\ m={>p, M)} be the set of all permutations which are extensions of the partial permutation ((p, M). Further, we define ^(M) : = {<p{i)\i^M}<^N. Let a Koopmans-Beckmann problem be given by the two (nXn) -matrices A and B. Further, let a partial permutation ((p, M) be given. The following objective value belongs to a <p ^ 0j,/: The first sum in (4) can computed exactly by the partial permutation {(p, M). For the second and third sum in (4) , a lower bound can be derived by solving a linear assignment problem (LAP) with a cost matrix C={Ci,) {% ^ M, j^^(M)). Moreover, the optimal solution of this LAP gives hints as to which assignment (i—^j) should be made in the next step. The coefficients c^ are chosen as lower bounds for the contribution of the assignment i toj to the objective value. QUADRATIC ASSIGNMENT PROBLEMS 131 First of all, one can put Then one determines ("reduction step") (6) 6; : = min 6,,; bj, : = 6,,- 6, U,g^'P (M) ,j^q) bj : = min b,j ; b,j:= b,j - ^, ij, q^<p(M), J9^q). gi'P(M) It is easy to show that is a further contribution which appears in z{(p) if the assignment {i~^j) is made. Due to an idea of Gilmore, the bound can be sharpened as follows. Let us order the elements of a vector a in increasing and of a vector b in decreasing order. Now let us denote the scalar product of these two ordered vectors as (a, 6). (a, b) is a lower bound for all scalar products n i=l with ip ^ Sn- Now we apply this to the data atp (i, p^M, i9^p) and the reduced elements b^g (j, g^<p (M), J9^q). If we define ci'i' = {(iip\p^M,p9^i), and bj: = {bj,\ql<p{M),q9^j),ilM,Ji^{M), then (at, bj) is a further contribution to the objective value zi^p), if the assignment {i — *j) appears in the permutation (p ("Gilmore-bound"). 
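The ordering argument behind the Gilmore bound can be checked numerically. The sketch below (function names and sample vectors are ours, chosen for illustration) forms the scalar product of one vector sorted increasingly with the other sorted decreasingly and compares it with the minimum over all permutations obtained by enumeration.

```python
from itertools import permutations

def minimal_scalar_product(a, b):
    """Scalar product of a (increasing order) with b (decreasing order).

    By the rearrangement inequality this is a lower bound for
    sum(a[i] * b[phi[i]]) over every permutation phi.
    """
    return sum(x * y for x, y in zip(sorted(a), sorted(b, reverse=True)))

def enumerated_minimum(a, b):
    """Reference value by complete enumeration (small vectors only)."""
    n = len(a)
    return min(sum(a[i] * b[phi[i]] for i in range(n))
               for phi in permutations(range(n)))

# Hypothetical sample vectors (illustration only).
a_vec = [3, 1, 4]
b_vec = [2, 7, 1]
print(minimal_scalar_product(a_vec, b_vec), enumerated_minimum(a_vec, b_vec))
```

Sorting costs $O(n \log n)$, which is why this bound is cheap enough to be evaluated for every candidate assignment when the cost matrix of the linear assignment problem is built.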
Thus the coefficients dj can be determined in the following way: =a«&j;+X) {(f'ipbi^(,p)+apib^^p)j)-i-/bj X) <lip^-bj X) «pA+(ai, b,) p^M I p^M p£M J \ p^i p^i ^ In the case of symmetric Koopmans-Beckmann problems, the computation of the coefficients Cij can be simplified. A bound for the corresponding QBP is obtained by replacing every sum sign in (7) by "maxi- mum". The coefficients c<^ are then interpreted as cost coefficients of a linear bottleneck problem (LBP). For general QAP [QBP], the contribution (a^, b^ has to be replaced by the objective value of a further LAP [LBP]. Numerical results showed that the occurring linear problems should be solved optimally. For solving LAP on algorithm of Dorhout [12] was used, whereas for solving LBP the code of Pape and Schon [26] was applied. A comparison of methods for linear problems can be found in Burkard [5]. 132 R. E. BURKARD & K. STRATMANN Now let 2£, be the objective value and (Cij) the optimal tableau of the LAP with cost co- efficients Cj. A lower bound for the objective value (4) of the permutations (p^<I>m is (8) 3(</>m):= S «.p6(?«)*>(p)+2i. i, p^M The branching in the perturbation method is done as follows : Start with the partial permutation with empty M. Let z be an upper bound for the optimal objective value. If 2(0ji/)<2 choose a pair (i, j) of indices, i^M, j^<p{M), by the rule of alternative costs, replace M by MU {i}, and extend the partial permutation by the assignment (i — *j). Then a new bound z{<I>m) is computed. The new coefficients dj can be gained from the last considered tableau C=(cij). If \M\=n—2, the objective value of the two corresponding permutations is computed exactly and eventually the bound z is improved. If z{4>m)^z, one returns to the last found partial permutation (<p, M) with 2(<^j^X2. The assignment {i — >j), chosen at the transition of {tp, M) to its successor, is now blocked. 
If the blocked element is the first blocked one in the matrix C= (ctj) corresponding to {<p, M) , a new assignment is chosen in the same row or column of C. If, however, row i (column ji) contains more blocked elements, the new assignment has to be chosen in the same row (column). If this is not possible, one has to return to the predecessor of the partial permutation {ip, M). The optimal solution has the value z as soon as an empty M is reached. The used rule of alternative costs runs as follows: Let (p* be an optimal solution of the LAP with cost coefficients c,j and let (?<;) be the corresponding optimal tableau. Compute now for every (^, <P*{i)) (9) «<:= min c<^ /3<: = min Cp^Vo i V The value ai+/3i gives a lower bound for the increase of the bound Zl, if the assignment {i — ><^*(i)) would not be used ("alternative costs")- Favorably, one of those pairs (t, (p*{i)) is chosen for the next assignment which has maximal alternative costs. Therefore max (a<+)3() (for bottleneck problems max max («<, /Si)) is determined. If the maximum is taken in the index i, define the following new partial permutation {f, M) : M:=M[j{i} ^(t):=^*(i) ^ ip) : =(p ip) for all p^M. Table 1 shows the influence of the different bounding and branching rules on the running time. In version 1 the given data were reduced and then we put (^ij -^^Cij ~r2—i \^ip'^i<p(.v)\0'vi">pip)j)- Branching was done according to alternative costs. In version 2 the Gilmore bounds were computed without preceding reduction, and branching was done according to the max-min rule [15]. Version 3 (perturbation method) combines reduction step and Gilmore bounds and uses the alternative costs for branching. QUADRATIC ASSIGNMENT PROBLEMS 133 Table l. Average Times for Solving 20 Koopmans-Beckmann Problems of Dimension n=8 with Different Branch and Bound Methods on CDC CYBER 72/76. The Data at,,, bj, were Uniformly Distributed in the Interval [0, 6S\. 
             Average time for the solution    Average number of solved linear problems
Version 1    3.373 sec                        9964
Version 2    0.923 sec                        1093
Version 3    0.286 sec                        283

Table 2 shows the large numerical differences between QAP and QBP:

Table 2. Comparison of QAP and QBP by means of the Nugent Examples [25].

Example     Dimension    QAP (a)                          QBP (a)
                         OV     Time [sec]    #LAP        OV    Time [sec]    #LBP
Nugent 1    5            50     0.006         8           5     0.008         12
Nugent 2    6            86     0.020         25          10    0.004         1
Nugent 3    7            148    0.032         28          10    0.007         8
Nugent 4    8            214    0.186         189         10    0.019         24
Nugent 5    12           578    29.325        15575       15    1.608         2312
Nugent 6    15           -      -             -           15    0.880         723
Nugent 7    20           -      -             -           20    35.276        31941

(a) The perturbation method (Version 3) was used for solving QAP, whereas the QBP were solved without Gilmore bounds. Running times in seconds on CDC CYBER 72/76. OV: optimal value of the objective, #LAP [LBP]: number of solved LAP [LBP].

Numerical tests for QBP showed that it is more favorable to solve these problems without computing the Gilmore bounds for the occurring linear problems [13]. Table 2 shows clearly that bottleneck problems are considerably easier to solve than sum problems. This can be explained by an algebraic theory [6]. Moreover, Table 2 shows that QAP can be solved optimally up to dimension 15 and QBP up to dimension 20 in reasonable time. Since real-world problems nearly always have larger dimensions, we treat suboptimal methods in the following sections.

3. STARTING PROCEDURES

Suboptimal methods can be divided into two large classes: construction methods, which yield a first feasible solution, and improvement methods, by which already found solutions can be improved. Besides the start phase of branch and bound methods there are a large number of heuristic approaches which can be used as construction methods. Some of them are:

3.1 $A_i$-$B_j$ RULE [24]: This is an improved version of Parker's "Best Match" Rule [27]. Let $(\varphi, M)$ be a partial permutation. Compute the values $A_i$ and $B_j$, and assign an index $i$ with minimal value $A_i$ to an index $j$ with maximal value $B_j$.
Then replace $M$ by $M \cup \{i\}$ and compute the new values $A_i$, $B_j$.

3.2 RULE OF THE NEAREST NEIGHBOUR: The first step is performed as in 3.1. If finally an $i_0$ was assigned to an index $j_0$, then assign in the next step an $i$ with minimal $(a_{i i_0} + a_{i_0 i})$ to an index $j$ with maximal $(b_{j j_0} + b_{j_0 j})$.

3.3 PAIRWISE ASSIGNMENT: At first
$a_{i_0 p_0} = \max\{a_{ip} \mid i, p = 1, 2, \ldots, n\}$
$b_{j_0 q_0} = \min\{b_{jq} \mid j, q = 1, 2, \ldots, n\}$
are computed and the pair $\{i_0, p_0\}$ is assigned to $\{j_0, q_0\}$. Then
$a_{i_1 p_1} = \max\{a_{ip} \mid (i, p) \neq (i_0, p_0)\}$
is determined. If $\{i_0, p_0\} \cap \{i_1, p_1\} = \emptyset$, compute
$b_{j_1 q_1} = \min\{b_{jq} \mid j \neq j_0, q \neq q_0\}$
and assign $\{i_1, p_1\}$ to $\{j_1, q_1\}$. Otherwise, determine
$b_{j_1 q_1} = \min\{\min\{b_{jq} \mid j \neq j_0, q = q_0\}, \min\{b_{jq} \mid j = j_0, q \neq q_0\}\}$
and assign $i^* \in \{i_0, p_0\} \cap \{i_1, p_1\}$ to $j^* \in \{j_0, q_0\} \cap \{j_1, q_1\}$. This leads to two further assignments for the elements $\{i_0, p_0, i_1, p_1\} \setminus \{i^*\}$.
If $i$ is already assigned to $j$, then only those pairs of indices $(j, q)$ are admitted in the determination of $\min b_{jq}$ for which $q$ is not yet assigned. If both elements $i$ and $p$ are in already examined pairs, for example $(i, i_0)$ and $(p, p_0)$, and if $(j, j_0)$, $(q, q_0)$ are assigned to these pairs, then determine
$\min\{b_{rs} \mid r \in \{j, j_0\}, s \in \{q, q_0\}\}$.

3.4 METHODS WITH INCREASING DEGREE OF FREEDOM [24]: Start with empty $M$. Now choose an $i \notin M$ and put $M := M \cup \{i\}$, $k := |M|$. Assign an index $j \in \{1, 2, \ldots, n\}$ to this $i$, such that
(10) $\sum_{i, p \in M} a_{ip} b_{\varphi(i)\varphi(p)}$
becomes minimal. To determine the free index $j \notin \varphi(M)$ the sum (10) has to be computed $k(n-k+1)$ times. If $i$ is assigned to a $j_0 \in \varphi(M)$, then the index $i_0$ of the formerly found assignment $(i_0 \to j_0)$ has to be reassigned to a free index $j \notin \varphi(M)$. This method can be performed successively with different choices of the free indices $i \notin M$, and by this the result can often be improved considerably. We performed this method $n/3$ times successively; see Table 3 ($n$ = dimension of the problem).
To compare the efficiency of the described methods, these were tested by numerous examples.
Table 3 shows some results for selected published test examples, namely, for the Nugent examples with $n$ = 12, 15, 20, 30, for the test example of Armour-Buffa ($n$ = 20) with direct (D.D.) and rectangular distance (R.D.), as well as for the example of Steinberg ($n$ = 36) with D.D., R.D., and squared direct distance (S.D.D.). Further test examples, data and comparisons of starting procedures can be found in Stratmann [31] and Parker [27]. From the numerical results the following conclusions can be drawn: methods 3.1-3.3 yield the worst results, yet they have the least running time. The method with increasing degree of freedom yields favorable results if it is applied several times successively with different starting values. Moreover, it is simple to program and economical in core storage; it is therefore recommended for rough layouts. Branch and bound methods need more central memory and more programming effort even in the start phase; they are, however, superior to all other methods with respect to the quality of the suboptimal solution.

Table 3. Comparison of Starting Procedures. OV: value of the objective, time in seconds on CDC CYBER 72/76. The values of method 3.4 belong to an n/3-times performance of this method.

Method               3.1                  3.2                  3.3                  3.4
Example              OV         Time      OV         Time      OV         Time      OV         Time
Nugent n=12          686        0.001     642        0.001     706        0.003     624        0.057
Nugent n=15          1326       0.001     1190       0.002     1278       0.007     1206       0.166
Nugent n=20          3026       0.003     3046       0.003     2904       0.018     2790       0.596
Nugent n=30          7262       0.005     7502       0.006     7190       0.080     6596       4.777
Armour-Buffa D.D.    3620.075   0.002     3151.431   0.002     3972.586   0.026     2646.684   0.597
Armour-Buffa R.D.    4074.165   0.003     3875.335   0.002     3899.965   0.027     3379.615   0.597
Steinberg R.D.       7419       0.008     6963       0.008     8669       0.141     5883       11.671
Steinberg D.D.       6112.74    0.007     5928.22    0.008     7569.34    0.141     4940.03    11.680
Steinberg S.D.D.     19531      0.007     20893      0.008     33715      0.138     11468      11.671

Table 3 (continued).

Method               BB, Version 1        BB, Version 2        BB, Version 3
Example              OV         Time      OV         Time      OV         Time
Nugent n=12          634        0.019     596        0.033     604        0.032
Nugent n=15          1208       0.037     1350       0.067     1150       0.066
Nugent n=20          2812       0.104     2836       0.181     2682       0.176
Nugent n=30          6522       0.457     6918       0.756     6388       0.706
Armour-Buffa D.D.    2875.632   0.093     3076.559   0.175     2767.658   0.180
Armour-Buffa R.D.    3246.25    0.094     3569.01    0.174     3540.12    0.189
Steinberg R.D.       5755       0.647     5791       1.412     5805       1.53
Steinberg D.D.       4956.32    0.852     4542.06    1.400     4921.72    1.565
Steinberg S.D.D.     13753      0.849     10656      1.444     13119      1.552

4. MODIFIED BRANCH AND BOUND METHODS

Branch and bound methods can be modified in different respects to get good suboptimal solutions instead of optimal solutions. Many examples show that the optimal solution is found quickly by these methods, yet a multiple of that time is needed to prove optimality. The method of time limit takes advantage of this experience by stopping the implicit enumeration if no improvement can be found after a given time. This method can easily be combined with the method of decreasing demands: whenever no improvement can be made after a given time, the upper bound is decreased. By this process the solution tree is pruned more strongly, at the risk that the optimal solutions are cut off. Table 4 shows some results found with the method of decreasing demands. Since the lower bounds of a partial permutation $(\varphi, M)$ depend strongly on the level $k := |M|$, one can try to accelerate the search process by modifying the bounds according to the level $k$.

Table 4. Test Examples Solved by the Method of Decreasing Demands.

Example              a     b          c       d     e          f
Nugent               15    1150       3.38    1.5   1130       47.96
                     15    1150       3.38    2.0   1130       37.13
Nugent               20    2600       8.00    1.5   2658       121.10
                     20    2600       8.00    2.0   2658       90.46
Nugent               30    6186       27.00   1.5   6302       525.24
                     30    6186       27.00   2.0   6302       392.34
Armour-Buffa D.D.    20    2853.893   8.00    2.0   2530.151   110.58
Armour-Buffa R.D.    20    3165.951   8.00    2.0   3036.705   110.50
Steinberg D.D.       36    4138.72    46.66   2.0   4280.60    1042.36
Steinberg S.D.D.     36    7946       40.66   2.0   10242      1123.24
Steinberg R.D.       36    4829       46.66   2.0   5791       1122.28

a = dimension, b = best published objective value, c = time interval in seconds after which the upper bound is decreased, d = percentage by which the upper bound is decreased, e = obtained objective value, f = required time in seconds on CDC CYBER 72/76.

Numerical results show [31] that the bounds increase quadratically with $k$ for small values of $k$ and linearly for larger values of $k$. If the parameters of this behavior are determined by test runs, the search can be restricted to those branches which show a larger deviation from the normal behavior of the respective example. Details of this method can be found in Stratmann [31]. It shows features similar to the statistical approach of Graves and Whinston [16]. Table 5 shows some results obtained with modified branch and bound algorithms.

Table 5. Comparison of Results Obtained by Modified Branch and Bound Methods.

Example              Dimension    a          b          c          d
Nugent               15           1150       1126       1130       1150
                                             382.203    37.130     60.360
Nugent               20           2600       2658       2658       2664
                                             251.236    90.456     250.100
Nugent               30           6186       6302       6302       6350
                                             260.571    392.354    250.702
Armour-Buffa D.D.    20           2853.893   2466.817   2530.151   2397.532
                                             629.527    110.576    306.741
Armour-Buffa R.D.    20           3165.951   2996.600   3036.705   2905.540
                                             261.306    109.987    276.874
Steinberg D.D.       36           4138.72    4280.60    4280.60    4294.71
                                             351.743    1042.361   351.460
Steinberg S.D.D.     36           7946       10242      10242      10476
                                             353.247    1123.240   351.500
Steinberg R.D.       36           4829       5791       5791       5805
                                             352.941    1122.878   351.496

a = best published objective value, b = objective value obtained by the method with time limit, with the running time below it, c = objective value obtained by the method with decreasing demands, with the running time below it, d = objective value obtained by the method with variable bounds, with the running time below it. All running times are in seconds on a CDC CYBER 72/76.

5. THE METHOD OF GASCHÜTZ AND AHRENS

Gaschütz and Ahrens [14] developed a heuristic method based on geometric ideas for solving Koopmans-Beckmann problems with matrices $A$ and $B$. They suppose that $A$ is symmetric and has only zero elements in its main diagonal. The element $a_{ip}$ of the matrix $A$ can be interpreted as the distance between the places $i$ and $p$. The elements of $B$ are regarded as units of goods which have to be transported between plant $j$ and plant $q$. We conceive the plants $j, q, \ldots$ as vertices of a graph. These vertices are linked by edges with the valuation $b_{jq}$, if $b_{jq} \neq 0$. Moreover, we suppose that the places $i, p, \ldots$ lie in a rectangle. Many real-world problems satisfy these assumptions, e.g. the problems of Nugent and Steinberg [25, 30]. Now an assignment $\varphi$ of the plants to the places is wanted such that the objective function
$\sum_{i=1}^{n}\sum_{p=1}^{n} a_{ip} b_{\varphi(i)\varphi(p)}$
becomes minimal.
Let us start with an arbitrary permutation. Let $z_j$ be the coordinates of plant $j$ for this permutation in the geometrical representation in the given rectangle. Now we transform these coordinates such that plants with a large exchange of goods lie close together. The transformations (11) involve a free parameter $f$. A favorable choice of $f$ depends on $t$, the mean value of the elements $b_{jq}$ ($j = 1, \ldots, n$; $q = 1, \ldots, n$), and on $l$, the number of nonzero elements in the matrix $B$. An interpretation of these transformations is given by Gaschütz and Ahrens [14] or Stratmann [31].
These transformations are performed several times in succession, in the course of which the parameter f should tend to zero. The transformations (11) gather the vertices of the graph into clusters in the interior of the rectangle; therefore, the image of the graph has to be extended by "regulating transformations" such that the rectangle is filled out. Thereafter the plants have to be assigned to new places. Gaschütz and Ahrens find this assignment by a procedure "GRID", whereas we propose to solve a LAP. It is favorable to take the squared direct distances as cost coefficients of the LAP. The assignment found in this way is the starting permutation for a new iteration, i.e. the performance of transformations (11), of the regulating transformations, and of the solution of a LAP. The time expense of an iteration is low. Table 6 shows the average times depending on the dimension.

Table 6. Average Running Times per Iteration for the Method of Gaschütz and Ahrens on CDC CYBER 72/76.

Dimension     15      20      30      36
Time (sec)    0.021   0.045   0.145   0.198

The following suggestions for the performance of this method are based on numerous test runs:

• The number of iterations can be restricted to 2-3. A larger number did not improve the solution.
• The number of transformations (11) should equal the dimension n of the problem.
• With increasing n the regulating transformations should be performed at ever shorter intervals.
• For n=20 it is enough to apply a regulating transformation once or twice after every third to fifth transformation, but for n>30 a regulating transformation should be applied after every transformation (11).
• GRID was compared with LAP, the cost coefficients of which were defined by direct, rectangular, and squared direct distances. There were no large differences in the results.

Table 7 shows the best results found with the Gaschütz-Ahrens method during the test runs.

Table 7.
Best Results Obtained in the Test Runs for the Gaschütz-Ahrens Method.

Example            Dimension   a   b    c   d   e                     Objective Value
Nugent             15          1   15   3   2   R.D., D.D., S.D.D.    1156
Nugent             20          3   20   3   1   GRID                  2686
Nugent             30          1   30   5   1   S.D.D.                6378
Steinberg D.D.     36          1   36   1   2   D.D.                  4384.93
Steinberg S.D.D.   36          1   36   1   1   D.D.                  8984
Steinberg R.D.     36          1   36   1   1   GRID                  5452

a = result found in the ath iteration, b = number of transformations of form (11), c = regulating transformations applied after every cth transformation of the form (11), d = number of regulating transformations performed each time, e = method by which the last assignment was determined.

A comparison of the results in Table 7 with the best known results shows that the Gaschütz-Ahrens method alone finds only mediocre solutions. The advantages of this method are that it can be started with an arbitrary permutation and that the result can be improved by exchange methods (see next section). Because the start permutation is arbitrary, many independent trials can be performed which together yield a very good solution.

Another geometrically oriented approach was applied by Krarup [21] to determine hospital layouts.

6. IMPROVEMENT METHODS

Knowing a suboptimal solution $\varphi$, one can try to improve the objective value by exchanging two assignments. Starting with $\varphi$, a new permutation $\varphi^*$ is defined by

\[ \varphi^*(i) := \varphi(p), \qquad \varphi^*(p) := \varphi(i), \qquad \varphi^*(k) := \varphi(k) \ \text{ for all } k \in N \setminus \{i, p\}. \]

In total, there are $\binom{n}{2}$ possibilities for pair exchanges. Checking all these exchanges is called an iteration. Several procedures were proposed for performing the iterations.

6.1 METHOD OF FIRST IMPROVEMENT: In Parker's [27] notation this method is called "slow restart with no order." The pairwise exchanges are examined in fixed order until an improvement of the objective is found. Carry out this exchange and start a new iteration.

6.2 CRAFT [1]: Examine all possible pair exchanges and carry out the one which yields the largest improvement. Then start a new iteration.

6.3 METHOD OF HEIDER [17]: Examine the pairwise exchanges in fixed order and carry out an exchange as soon as an improvement is found. Therefore it is possible that there are several improvements in one iteration.

All methods are stopped if an iteration does not yield an improvement.

It is not necessary to compute both objective values if $\varphi$ and $\varphi^*$ differ only in the images of i and p. We compute for $M := N \setminus \{i, p\}$:

\[
\begin{aligned}
\delta := {} & \sum_{k \in M} \Big[ (a_{ik} - a_{pk})\big(b_{\varphi(i)\varphi(k)} - b_{\varphi(p)\varphi(k)}\big) + (a_{ki} - a_{kp})\big(b_{\varphi(k)\varphi(i)} - b_{\varphi(k)\varphi(p)}\big) \Big] \\
& + (a_{ii} - a_{pp})\big(b_{\varphi(i)\varphi(i)} - b_{\varphi(p)\varphi(p)}\big) + (a_{ip} - a_{pi})\big(b_{\varphi(i)\varphi(p)} - b_{\varphi(p)\varphi(i)}\big). \qquad (12)
\end{aligned}
\]

If $\delta > 0$, then $\varphi^*$ yields a lower objective value than $\varphi$. To compute $\delta$, only (2n-2) multiplications and (5n-4) additions are needed.

In methods 6.1 and 6.3, the result depends on the order in which the exchanges are examined. One way to take advantage of this is to fix, say, 10 different arbitrary orders in which the exchanges are examined. Another possibility is to find an order which promises a large improvement. Considerations in this direction were made by Khalil [20], Parker [27], and Stratmann [31] (see Section 7). Table 8 shows some data for the pair exchange algorithms 6.1-6.3.

Table 8. Comparison of Pair Exchange Algorithms 6.1-6.3.

Example    Dimension   Start solution   Method 6.1        Method 6.2        Method 6.3
Nugent 6   15          1252             1154  6/0.021     1154  6/0.042     1146  3/0.042
                       1326             1140  14/0.039    1172  5/0.034     1142  3/0.028
Nugent 7   20          2750             2640  11/0.088    2668  8/0.130     2674  3/0.058
                       3026             2644  19/0.086    2684  7/0.115     2634  5/0.103
Nugent 8   30          6710             6222  29/0.438    6346  10/0.535    6224  4/0.258
                       7242             6220  54/0.802    6604  12/0.633    6192  5/0.327

The first start solution was determined by the method of increasing degree of freedom (3.4), the second by the A-B rule (3.1). The first number shows the objective value found, the second the number of iterations and the third the running time in seconds on CDC CYBER 72/76.
For methods 6.1 and 6.3 the respective best results out of 10 test runs with arbitrary order are shown.

Altogether, the test runs show that repeated runs of methods 6.1 and 6.3 yield better results than CRAFT and that, for higher dimensions, Heider's method is faster than method 6.1.

Analogous to pairwise exchange, one can try to improve the objective value by cyclic exchange of three assignments. In doing so, the computational expense grows considerably. The following table shows how the time needed by the pair exchange and cyclic triple exchange algorithms depends on the dimension on CDC CYBER 72/76.

Table 9. Average Running Times in Seconds for One Iteration of Pair and Cyclic Triple Exchange Algorithm on CDC CYBER 72/76.

Dimension         15      20      30      36
Pair exchange     0.008   0.019   0.062   0.108
Triple exchange   0.075   0.256   1.394   2.931

It is not efficient to replace pair exchange algorithms by triple exchange algorithms, since they do not yield essentially better results but cost much more. Yet triple exchange algorithms can supplement pair exchange algorithms.

7. THE IMPROVEMENT PACKAGE VERBES

Test runs with the above mentioned improvement methods showed:

• If the starting solution is only mediocre, then several runs of Heider's method (6.3) are the most time-saving way to reach a good result.
• If the starting solution is of high quality, then it is favorable not to use an arbitrary order of trial switches. In this case the order of exchanges is determined such that those elements are examined first which most probably yield an improvement.

In the second case the order of exchanges is fixed in the following way: Fix an element i, write $j = \varphi(i)$, and order the sets $N \setminus \{i\}$ and $N \setminus \{\varphi(i)\}$ in sequences $(i_t)$, $1 \le t \le n-1$, and $(j_r)$, $1 \le r \le n-1$, such that

\[ j_r < j_s \iff b_{j j_r} + b_{j_r j} > b_{j j_s} + b_{j_s j}. \]

Then define

\[ g(i) := \sum_{\nu=1}^{n-1} (n-\nu)(n-\mu) \]

with $\mu = \mu(\nu)$ defined by $\varphi(i_\nu) = j_\mu$. Now define the following order on N:

\[ i < i' \iff g(i) < g(i'). \qquad (13) \]
When a pair exchange algorithm no longer finds an improvement, a triple exchange algorithm can still yield a better solution. This leads to the following improvement package designated VERBES:

(a) Apply the Heider algorithm n/3 times with arbitrary order of the exchanges.
(b) Store the three best solutions found in (a).
(c) Apply a triple exchange algorithm with fixed order of exchanges to every one of the solutions stored in (b). The order of exchanges is defined by (13).
(d) If an improvement can be achieved in (c), repeat alternately the Heider and the triple exchange algorithm with fixed order, defined by (13), until neither yields an improvement.

The triple exchange algorithm used has the same philosophy as 6.3 ("fast restart" [27]).

Combination with suboptimal branch and bound methods yielded the following results (Table 10) with VERBES:

Table 10. Suboptimal Branch and Bound Methods Combined with VERBES.

                                       Time Limit            Decreasing Demands     Variable Bounds
Example             d    a             value      time       value      time        value      time
Nugent 6            15   1150          1126       382.203    1130       37.130      1216       3.989
                                       1126       0.115      1130       0.114       1130       0.423
Nugent 7            20   2600          2658       251.236    2658       90.456      2664       250.189
                                       2658       0.361      2658       0.367       2664       0.367
Nugent 8            30   6186          6302       360.571    6302       392.354     6350       250.761
                                       6200       3.988      6200       4.190       6196       7.448
Armour-Buffa D.D.   20   2853.893      2466.817   629.527    2530.151   110.576     2397.532   306.741
                                       2466.817   0.363      2471.508   1.951       2360.468   0.643
Armour-Buffa R.D.   20   3165.951      2966.600   261.306    3036.705   109.987     2905.540   276.074
                                       2966.600   0.365      3036.705   0.367       2841.170   0.872
Steinberg D.D.      36   4138.72       4280.60    351.964    4280.60    1042.361    4294.71    351.501
                                       4253.29    6.470      4253.29    6.468       4253.29    9.412
Steinberg S.D.D.    36   7946          10242      353.247    10242      1123.240    10476      351.503
                                       9436       19.458     9288       30.456      8606       40.635
Steinberg R.D.      36   4829          5791       352.943    5791       1122.870    5805       351.501
                                       5033       28.123     5033       28.126      5115       24.154

d = dimension, a = best published objective value.
Upper rows: objective value and time for the branch and bound method; beneath them, the result after applying VERBES and the time in seconds needed by VERBES on CDC CYBER 72/76.

While VERBES can be applied to only one suboptimal solution found by branch and bound methods, many suboptimal solutions can be obtained by the Gaschütz-Ahrens method. If VERBES is applied to all of them, better results (Table 11) are found throughout than those of Table 10. About 50-80 iterations were computed with the Gaschütz-Ahrens procedure with different choices of the parameters. Altogether, the following times have to be taken into account on a CDC CYBER 72/76.

Dimension   Total running time
15          ~ 1 min.
20          ~ 3 min.
30          ~ 10 min.
36          ~ 20 min.

Table 11. Comparison of Solutions Found With the Gaschütz-Ahrens Method Combined With VERBES and Those of Suboptimal Branch and Bound Procedures.

Example              a    b           c                               d
Nugent 6             15   1150        1126      T                     1126
Nugent 7             20   2600        2658      T, D                  2594
Nugent 8             30   6186        6196      B+V                   6178
Armour-Buffa, D.D.   20   2853.893    2360.468  B+V                   -
Armour-Buffa, R.D.   20   3165.951    2811.170  B+V                   -
Steinberg, D.D.      36   4138.72     4253.29   T+V, D+V, B+V         4132.97
Steinberg, S.D.D.    36   7946        8606      B+V                   7926
Steinberg, R.D.      36   4829        5033      T+V, D+V              4804
Krarup               30   91730       91610     T+V                   91420

a = dimension, b = best published objective value, c = best objective value found with branch and bound methods (T = method of time limit, D = method of decreasing demands, B = method of variable bounds, V = VERBES), d = best objective value found with the Gaschütz-Ahrens method combined with VERBES. The examples of Armour-Buffa cannot be solved by the Gaschütz-Ahrens procedure.

Stimulated by the good results with the combination of Gaschütz-Ahrens and VERBES, we investigated the possibility that comparable results could be obtained by modified branch and bound methods in similar times. This would have the advantage of being universally applicable in contrast to the first method.
It turned out that the following combination of the perturbation method with VERBES yields comparable results:

(a) Determine a suboptimal solution with the perturbation method using a time limit (about 8-10 sec. for n = 30).
(b) Improve this suboptimal solution by VERBES.
(c) If an improved solution is found in (b), determine the smallest level $k_0$ in the branching process of (a) which was changed by the exchange algorithm.
(d) Apply the perturbation method again, starting from this level $k_0$. Verify that already examined solutions are not enumerated again. Return to (b).

Table 12 shows the objective values obtained by this method. Running times as well as objective values are of the same order as those of the Gaschütz-Ahrens procedure with VERBES. The programming expense, however, is considerably larger for the above mentioned method than for the combination Gaschütz-Ahrens with VERBES.

Table 12. Perturbation Method Combined With VERBES-Procedure.

Example            a    b           c
Nugent 6           15   1126        16.725
Nugent 7           20   2574(*)     1:25.736
Nugent 8           30   6158(*)     6:50.778
Steinberg D.D.     36   4132.29     18:09.442
Steinberg S.D.D.   36   8109        20:53.785
Steinberg R.D.     36   4807        13:59.374
Krarup             30   90420(*)    6:07.192

a = dimension, b = obtained objective value, c = time in minutes and seconds on CDC CYBER 72/76 for 25 runs of the combination perturbation method-VERBES. A (*) denotes a new record result.

The tests performed show that the Gaschütz-Ahrens procedure combined with VERBES and the perturbation method combined with VERBES produce very good results in acceptable time. The programming expense of the perturbation method combined with VERBES is rather high; this method, however, is a procedure for all purposes. The method of increasing degree of freedom yields good results combined with the improvement package VERBES at a smaller programming expense.
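The pair-exchange step of Section 6 and Heider's rule (6.3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' program: it assumes that both matrices are symmetric with zero diagonal, in which case (12) collapses to a single sum, and the function names and the randomly generated test matrices are hypothetical.

```python
import random


def objective(a, b, phi):
    """Koopmans-Beckmann objective: sum over i, p of a[i][p] * b[phi[i]][phi[p]]."""
    n = len(phi)
    return sum(a[i][p] * b[phi[i]][phi[p]] for i in range(n) for p in range(n))


def swap_gain(a, b, phi, i, p):
    """Decrease of the objective when the images of i and p are exchanged.

    Simplified form of (12), valid only under the assumption that a and b
    are symmetric with zero diagonal. A positive value means an improvement.
    """
    return 2 * sum(
        (a[i][k] - a[p][k]) * (b[phi[i]][phi[k]] - b[phi[p]][phi[k]])
        for k in range(len(phi))
        if k != i and k != p
    )


def heider_pair_exchange(a, b, phi):
    """Sweep all pairs in fixed order, exchanging as soon as a gain is found.

    Stops after the first complete sweep (iteration) without improvement.
    """
    phi = list(phi)
    n = len(phi)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for p in range(i + 1, n):
                if swap_gain(a, b, phi, i, p) > 0:
                    phi[i], phi[p] = phi[p], phi[i]
                    improved = True
    return phi


def random_symmetric(n, rng, high=9):
    """Hypothetical test data: symmetric integer matrix with zero diagonal."""
    mat = [[0] * n for _ in range(n)]
    for i in range(n):
        for p in range(i + 1, n):
            mat[i][p] = mat[p][i] = rng.randint(0, high)
    return mat


rng = random.Random(0)
n = 8
a = random_symmetric(n, rng)
b = random_symmetric(n, rng)
start = list(range(n))
rng.shuffle(start)
result = heider_pair_exchange(a, b, start)
print(objective(a, b, start), "->", objective(a, b, result))
```

Because every accepted exchange strictly lowers an integer-valued objective, the loop terminates. For non-symmetric matrices the full expression (12) would have to replace `swap_gain`.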
APPENDIX

Summary of test examples and the best known results

Test example Nugent 1 [25]
Optimal objective value: 50
Optimal solutions: (1, 4, 2) (3, 5), (1, 4, 2, 5, 3)

Test example Nugent 2 [25]
Optimal objective value: 86
Optimal solutions: (1) (2) (3) (4) (5) (6), (1, 3) (2) (4, 6) (5), (1, 4) (2, 5) (3, 6)

Test example Nugent 3 [25]
Optimal objective value: 148
Optimal solution: (1) (2) (3, 4, 5) (6, 7)

Test example Nugent 4 [25]
Optimal objective value: 214
Optimal solutions: (1, 2) (3, 4, 5) (6, 8) (7), (1, 3, 7, 4, 6) (2, 8, 5)

Test example Nugent 5 [25]
Dimension: 12
Best published objective value: 578 [25]
Optimal objective value: 578 (perturbation method)
Solution: (1, 5, 4, 2, 6, 8) (3, 10, 7, 11, 9, 12)

Test example Nugent 6 [25]
Dimension: 15
Best published objective value: 1150 [25, 20]
New objective value: 1126 (Gaschütz-Ahrens + VERBES, perturbation method)
Solution: (1) (2, 11, 4, 12, 7, 8, 13, 3, 9, 14, 5, 10, 6) (15)

Test example Nugent 7 [25]
Dimension: 20
Best published objective value: 2600 [20]
New objective value: 2574 (perturbation method + VERBES)
Solution: (1, 17, 2, 7, 20, 9, 11, 19, 3, 5, 6, 4) (8) (10, 13, 12, 15, 16, 18, 14)

Test example Nugent 8 [25]
Dimension: 30
Best published objective value: 6186 [25]
New objective value: 6158 (perturbation method + VERBES)
Solution: (1, 5, 4, 3, 29, 17, 11, 30, 15, 7, 28) (2) (6, 14, 9, 16, 8, 21, 10, 19) (12, 20, 13, 25) (18, 27, 26, 24) (22) (23).

Test example Steinberg D.D. [30]
Dimension: 36
$a_{ij} := \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}$
Best published objective value: 4138.72 [17]
New objective value: 4124.97 (Gaschütz-Ahrens + VERBES)
Solution:
24  22  21  27  11   6   5   3   -
26  25  23  14  12  13   4   8   2
33  34  32  19  20   7  10  18  17
 -  31  30  29  28  15   1   9  16

Test example Steinberg S.D.D.
[30]
Dimension: 36
$a_{ij} := (x_i - x_j)^2 + (y_i - y_j)^2$
Best published objective value: 7946 [14]
New objective value: 7926 (Gaschütz-Ahrens + VERBES)
Solution:
24  22  21  27  11   6   5   3   -
26  25  23  14  12  13   4   8   2
33  34  32  19  20   7  10  18  17
 -  31  30  29  28   1  15   9  16

Test example Steinberg R.D. [30]
Dimension: 36
$a_{ij} := |x_i - x_j| + |y_i - y_j|$
Best published objective value: 4829 [14]
New objective value: 4802 (perturbation method + VERBES)
Solution:
 -  16   9  15  29  30  31  34  33
17  18  10   7   1  28  19  32  26
 2   8   4  13  12  20  23  21  25
 -   3   5   6  11  14  27  22  24

Test example Krarup [21]
Dimension: 30
Best published objective value: 91730 [21]
New objective value: 90420 (perturbation method + VERBES)
Solution: [layout figure not recoverable from the scan]

Using a different arrangement for the rooms, Krarup has meanwhile found a solution with objective value 89400 (private communication).

BIBLIOGRAPHY

[1] Armour, G. C., and E. S. Buffa, "A Heuristic Algorithm and Simulation Approach to Relative Location of Facilities," Management Science 9 294-309 (1963).
[2] Buffa, E. S., G. C. Armour, and T. E. Vollmann, "Allocating Facilities with CRAFT," Harvard Business Review 42 136-158 (1964).
[3] Burkard, R. E., "Die Störungsmethode zur Lösung quadratischer Zuordnungsprobleme," Operations Research Verfahren 16 84-108 (1973).
[4] Burkard, R. E., "Quadratische Bottleneckprobleme," Operations Research Verfahren 18 26-41 (1974).
[5] Burkard, R. E., "Numerische Erfahrungen mit Summen und Bottleneckzuordnungsproblemen," in Numerische Methoden bei graphentheoretischen und kombinatorischen Problemen ed. by L. Collatz, G. Meinardus and H. Werner, 29 9-25 (Birkhäuser Verlag, Basel-Stuttgart ISNM 1975).
[6] Burkard, R. E., "Kombinatorische Optimierung in Halbgruppen," in Optimization and Optimal Control ed. by R. Bulirsch, W. Oettli, and J. Stoer, 477 2-17 (Springer LN Maths 1975).
[7] Burkard, R. E., "Heuristische Verfahren zur Lösung quadratischer Zuordnungsprobleme," Zeitschrift für Operations Research 19 183-193 (1975).
[8] Burkard, R. E., and A. Gerstl, "Quadratische Zuordnungsprobleme III: Testbeispiele und Rechenergebnisse," Report No. 85 Computer Centre Graz (1973).
[9] Burkard, R. E., W. Hahn and U. Zimmermann, "An Algebraic Approach to Assignment Problems," Mathematical Programming, 318-327 (1977).
[10] Burkard, R. E., and J. Offermann, "Entwurf von Schreibmaschinentastaturen mittels quadratischer Zuordnungsprobleme," Zeitschrift für Operations Research 21 B 21-32 (1977).
[11] Conrad, K., Das quadratische Zuweisungsproblem und zwei seiner Spezialfälle (Tübingen: Mohr-Siebeck, 1971).
[12] Dorhout, B., "Het Lineaire Toewijzingsproblem; Vergelijking van Algoritmen," Report BN 21, Mathematisch Centrum, Amsterdam (1973).
[13] Franke, B., "Quadratische Bottleneck-Zuordnungsprobleme," Diplomarbeit, Math. Institut, Universität Köln (1976).
[14] Gaschütz, G. K., and J. H. Ahrens, "Suboptimal Algorithms for the Quadratic Assignment Problem," Naval Research Logistics Quarterly 15 49-62 (1968).
[15] Gilmore, P. C., "Optimal and Suboptimal Algorithms for the Quadratic Assignment Problem," Journal of the Society for Industrial and Applied Mathematics 10 305-313 (1962).
[16] Graves, G. W., and A. B. Whinston, "An Algorithm for the Quadratic Assignment Problem," Management Science 16 453-471 (1970).
[17] Heider, C. H., "A Computationally Simplified Pair-Exchange Algorithm for the Quadratic Assignment Problem," Paper No. 101, Center for Naval Analyses, Arlington, Va. (1972).
[18] Heider, C. H., "An N-Step, 2-Variable Search Algorithm for the Component Placement Problem," Naval Research Logistics Quarterly 20 699-724 (1973).
[19] Hillier, F. S., and M. M. Connors, "Quadratic Assignment Algorithms and the Location of Indivisible Facilities," Management Science 13 42-57 (1966).
[20] Khalil, T. M., "Facilities Relative Allocation Technique (FRAT)," International Journal of Production Research 11 183-194 (1973).
[21] Krarup, J., "Quadratic Assignment," Data 5 (1972).
[22] Lawler, E. L., "The Quadratic Assignment Problem," Management Science 9 586-599 (1963).
[23] Lawler, E. L., "The Quadratic Assignment Problem: A Brief Review," in Combinatorial Programming: Methods and Applications ed. by B. Roy (Dordrecht-Boston: Reidel Publ. Co., 1975).
[24] Müller-Merbach, H., Optimale Reihenfolgen. 158-171 (Springer, Berlin-Heidelberg-New York, 1970).
[25] Nugent, C. E., T. E. Vollmann, and J. Ruml, "An Experimental Comparison of Techniques for the Assignment of Facilities to Locations," Operations Research 16 150-173 (1968).
[26] Pape, U., and B. Schön, "Verfahren zur Lösung von Summen und Engpaß-Zuordnungsproblemen," Elektronische Datenverarbeitung 4 149-163 (1970).
[27] Parker, C. S., "An Experimental Investigation of Some Heuristic Strategies for Component Placement," Operations Research Quarterly 27 71-81 (1976).
[28] Pierce, J. F., and W. B. Crowston, "Tree Search Algorithms for Quadratic Assignment Problems," Naval Research Logistics Quarterly 18 1-36 (1971).
[29] Pollatschek, M. A., H. Gershoni, and Y. T. Radday, "Optimization of the Typewriter Keyboard by Simulation," to appear in Angewandte Informatik (1977).
[30] Steinberg, L., "The Backboard Wiring Problem: A Placement Algorithm," SIAM Review 3 37-50 (1961).
[31] Stratmann, K. H., "Numerische Untersuchungen über Quadratische Zuordnungsprobleme," Diplomarbeit, Universität Köln (1976).

STEADY STATE WAITING TIME IN A MULTICENTER JOB SHOP

Asha S. Kapadia and Bartholomew P. Hsi
School of Public Health
The University of Texas
Houston, Texas

ABSTRACT

This paper analyzes the waiting-time distribution of a specific job as it moves through a job-shop with multiple centers and exponential service times. The movement of the job through the shop is governed by a Markovian transition matrix and ends with the job's exit from the shop.
INTRODUCTION

In recent years considerable effort has been expended in modeling the operations of a job-shop. The job-shop operation is a complex dynamic system with a continuous flow of orders. Certainly, the most important analytic result in the analysis of stochastic job-shops is the decomposition theorem developed by Jackson [5, 7]. This theorem gives sufficient conditions under which a general network of queues may be treated as an aggregation of independent queues. He demonstrated that the behavior of each machine center is stochastically independent if (1) the input to the shop is Poisson, (2) the routing of jobs is determined by a probability transition matrix, (3) the processing times are exponentially distributed, and (4) the dispatching rule is independent of a job's routing and processing times. Jackson [6] further studied the single service queuing system where customers are subject to static or dynamic priority disciplines. Although practical applications of such queue disciplines in a job-shop type situation have not yet been reported, they appear to be very appropriate for that environment.

Reinitz [8] constructed a mathematical model of machine centers within a job-shop to obtain the overall performance of the shop. He assumes that r different types of jobs arrive at a machine center according to a Poisson distribution. Each machine center with m identical machines, each having service rate $\mu/m$, is approximated by a single channel system with service rate $\mu$. The state of the system is defined by the type of job in service and the number of class p jobs (p = 1, 2, ..., r) waiting for service. Transitions from one state to another are made according to a transition probability matrix $(p_{pk})$. The mean processing rate of jobs is $\mu_{pk}$ if job k follows job p, where p, k = 1, ..., r. The processing time of a class p job includes the teardown and setup operations for the next job k. The entire job-shop is therefore described by a set of Markov chains, one for each machine center.
If Chapman-Kolmogorov equations are used, the differential equations for the waiting time distribution of jobs are obtained, which when coupled with Howard's [4] techniques can generate optimal due dates. A detailed survey of job-shop scheduling results is provided by Conway, et al. [3] and Baker [1], who also identify several interesting problems that still lack appropriate solution or analytic techniques.

This paper considers a job-shop operating situation similar to the one analyzed by Reinitz and derives the waiting-time distribution of a specific job as it moves through the job-shop. This waiting-time distribution leads to the determination of the average time the job spends in the system. These results may be useful in predicting, for example, the time required to enact a proposed legislative bill as it is processed in various executive and legislative branches of the government. Another example, which generates considerable interest in the health industries, is the waiting time of an outpatient in a hospital or a clinic. The patient may be considered as a job as he is directed through the waiting rooms, the physician's office, the laboratories, radiology, pharmacy, and the accounting or cashier's office.

MODEL

Consider a job shop with N machine centers, each machine center offering a different kind of service. The number of machines at center i is $m_i$, each machine having an exponential service rate of $\mu_i$ jobs per unit of time. If a job on arrival at a particular machine center finds all the machines busy, it waits for its turn in a queue. The queue length at any machine center may be infinite. A job may or may not go through all the machine centers in order to be completed. Once the job is completed, it exits from the shop and, in Markov chain terminology, enters the "trapping state" from which no transitions back into the shop are allowed. The system is said to be in state i if the job is in machine center i.
The job is in state T (trapping state) when it is completed. Let $p_{ij}$ be the probability of a one-step transition from state i to state j (i, j = 1, 2, ..., N, T) and S(n) be the transient state of a job at the nth transition. Thus, the transition matrix P can be expressed as

\[ P = [p_{ij}], \qquad i, j = 1, 2, \ldots, N, T, \]

with $p_{TT} = 1$ and $p_{Tj} = 0$ if $j \neq T$.

Let $(a_1, a_2, \ldots, a_N)$ represent the initial state probability vector, i.e. $a_i$ = Prob. (a job entering the job shop goes to machine center i). We assume that

\[ \sum_{i=1}^{N} a_i = 1 \qquad (a_T = 0). \]

Let $\phi_{ij}(n) = \text{Prob.}\{S(n) = j \mid S(0) = i\}$, i.e. the probability that a job will be in state j at the nth transition, given that it was in state i at the initial entrance. Clearly, $\phi_{ij}(n)$ is the (i, j)th element in the matrix $P^n$. Denote the z-transform of $\phi_{ij}(n)$ by $\phi^{*}_{ij}(z)$; i.e.

\[ \phi^{*}_{ij}(z) = \sum_{n=0}^{\infty} \phi_{ij}(n)\, z^n. \]

Then, following Burke ([2], p. 10), we derive the quantities $\phi^{*}_{ij}(z)$ for i, j = 1, 2, ..., N, T through the matrix equation

\[ [\phi^{*}_{ij}(z)] = (I - zP)^{-1}, \]

where $[\phi^{*}_{ij}(z)]$ is the matrix of the elements $\phi^{*}_{ij}(z)$, I is the identity matrix, and P is the transition matrix; all are of dimension $(N+1) \times (N+1)$.

RECURRENT FREQUENCIES

Let $n_{ij}$ denote the number of times a job enters state j before completion, given that the initial state is i. In order to find the expected value of $n_{ij}$, we define an indicator variable $X_{ij}(n)$ such that

\[ X_{ij}(n) = \begin{cases} 1 & \text{if } S(n) = j, \text{ given } S(0) = i, \\ 0 & \text{if } S(n) \neq j, \text{ given } S(0) = i. \end{cases} \]

By definition, $\phi_{ij}(n) = \text{Prob.}\{X_{ij}(n) = 1 \mid S(0) = i\}$. It follows that

\[ E[X_{ij}(n)] = \phi_{ij}(n). \]

One may express $n_{ij}$ in terms of $X_{ij}(n)$, i.e.

\[ n_{ij} = \sum_{n=0}^{\infty} X_{ij}(n). \]

Therefore the expected value of $n_{ij}$ is

\[ E[n_{ij}] = E \sum_{n=0}^{\infty} X_{ij}(n) = \sum_{n=0}^{\infty} \phi_{ij}(n) = \phi^{*}_{ij}(1). \]

For the variance and covariance of $n_{ij}$ and $n_{ik}$ ($j \neq k$) we express

\[ E(n_{ij} n_{ik}) = \sum_{m=0}^{\infty} \sum_{r=0}^{\infty} E[X_{ij}(m) X_{ik}(r)]. \]

The product $X_{ij}(m) X_{ik}(r)$ takes on the value 1 if and only if the job is in state j at time m and in state k at time r (or with the indices m and r exchanged). The expected value of the product is

\[
E[X_{ij}(m) X_{ik}(r)] = \text{Prob.}[S(m) = j,\ S(r) = k \mid S(0) = i] =
\begin{cases}
\phi_{ij}(m)\,\phi_{jk}(r-m) + \phi_{ik}(r)\,\phi_{kj}(m-r) & \text{when } k \neq j, \\
\phi_{ij}(m)\,\phi_{jj}(r-m) & \text{when } k = j,
\end{cases}
\]

where $\phi_{jk}(n) = \phi_{kj}(n) = 0$ when $n < 0$. It is easily seen by (1) that

\[ E(n_{ij} n_{ik}) = \sum_{m=0}^{\infty} \sum_{r=0}^{\infty} E[X_{ij}(m) X_{ik}(r)]. \qquad (2) \]

Therefore,

\[ \text{Var}(n_{ij}) = \phi^{*}_{ij}(1)\,\phi^{*}_{jj}(1) - [\phi^{*}_{ij}(1)]^2. \]

STEADY STATE EQUATIONS

We assume that the arrival of jobs in the machine centers follows the Poisson law with rate $\lambda$. Then the arrival rate into center i (i = 1, 2, ..., N, T) can be expressed as a set of simultaneous equations if the system is in a steady state, i.e. $\lambda_i$ is also the expected discharge rate for machine center i:

\[ \lambda_i = \lambda a_i + \sum_{j=1}^{N} \lambda_j p_{ji}, \quad i = 1, 2, \ldots, N; \qquad \lambda_T = \lambda. \]

AVERAGE TIME THROUGH THE SYSTEM

To obtain the average time through the system before a job is trapped, we introduce the following variables:

$w_i$ = time spent by a job each time it is in machine center i (including the time for waiting and being served)
$t_{ij}$ = the time taken for a job to go from machine center i to machine center j, constant for i, j
$m_i$ = number of machines at center i
$\lambda_i$ = arrival rate into machine center i
$\mu_i$ = service rate of a machine in center i (the service time distribution is assumed exponential, independent, and identical for all $m_i$ machines)
$\rho_i$ = traffic intensity at center i = $\lambda_i / (m_i \mu_i)$
$P_{0i}$ = steady state probability of no jobs at machine center i

\[ P_{0i} = \left[ \sum_{k=0}^{m_i - 1} \frac{(\lambda_i/\mu_i)^k}{k!} + \frac{(\lambda_i/\mu_i)^{m_i} (1 - \rho_i)^{-1}}{m_i!} \right]^{-1} \]

$P_{m_i}$ = steady state probability of there being $m_i$ jobs at machine center i = $P_{0i} (\lambda_i/\mu_i)^{m_i} / m_i!$.

The expected value and variance of $w_i$ are analogous to those of the waiting time in the M/M/$m_i$ queue with arrival rate $\lambda_i$ and service rate $\mu_i$ and are given as [8]:

\[ E(w_i) = \bar{w}_i = \frac{P_{m_i}}{m_i \mu_i (1 - \rho_i)^2} + \mu_i^{-1}. \qquad (4) \]

Now let t be the total time for the job to go through the system.
Then

\[ t = \sum_{i=1}^{N} a_i \Big[ w_i + \sum_{j=1}^{N} n_{ij} u_{ij} \Big], \qquad u_{ij} = t_{ij} + w_j. \]

If we assume $n_{ij}$ and $w_j$ are independent in a steady state, then

\[ E(t) = \sum_{i=1}^{N} a_i \Big[ \bar{w}_i + \sum_{j=1}^{N} E(n_{ij}) \big( t_{ij} + \bar{w}_j \big) \Big]. \]

The variance of t can be written as two terms:

\[
\text{Var}(t) = \text{Var} \sum_{i=1}^{N} a_i \Big( w_i + \sum_j n_{ij} u_{ij} \Big)
= \sum_{i=1}^{N} a_i^2\, \text{Var}\Big[ w_i + \sum_j n_{ij} u_{ij} \Big]
+ \sum_{i \neq i'} a_i a_{i'}\, \text{Cov}\Big[ w_i + \sum_j n_{ij} u_{ij},\ w_{i'} + \sum_j n_{i'j} u_{i'j} \Big].
\]

If we assume that $\text{Cov}(w_i, w_{i'}) = 0$ and $t_{ij}$ is constant for given i and j, the first term can be written as

\[
\begin{aligned}
\sum_{i=1}^{N} a_i^2\, \text{Var}\Big[ w_i + \sum_j n_{ij} u_{ij} \Big]
&= \sum_{i=1}^{N} a_i^2 \Big[ \text{Var}(w_i) + \text{Var} \sum_j n_{ij} u_{ij} + \text{Cov}\Big( w_i, \sum_j n_{ij} u_{ij} \Big) \Big] \\
&= \sum_{i=1}^{N} a_i^2 \Big[ \text{Var}(w_i) + \sum_j \text{Var}(n_{ij} u_{ij}) + \sum_{j \neq j'} \text{Cov}(n_{ij} u_{ij},\ n_{ij'} u_{ij'}) + E(n_{ii})\, \text{Var}(w_i) \Big] \\
&= \sum_{i=1}^{N} a_i^2 \Big[ \text{Var}(w_i) + E(n_{ii})\, \text{Var}(w_i) + \sum_j E(n_{ij}^2)\, \text{Var}(w_j) + \sum_j \bar{u}_{ij}^2\, \text{Var}(n_{ij}) + \sum_{j \neq j'} \bar{u}_{ij} \bar{u}_{ij'}\, \text{Cov}(n_{ij}, n_{ij'}) \Big].
\end{aligned}
\]

If we multiply out and cancel, the second term is

\[
\begin{aligned}
& \sum_{i \neq i'} a_i a_{i'} \Big[ E(n_{i'i})\, \text{Var}(w_i) + E(n_{ii'})\, \text{Var}(w_{i'}) + \sum_j E(n_{ij})\, E(n_{i'j})\, \text{Var}(w_j) \Big] \\
&\qquad = \sum_{i \neq i'} a_i a_{i'} \Big[ 2 E(n_{ii'})\, \text{Var}(w_{i'}) + \sum_j E(n_{ij})\, E(n_{i'j})\, \text{Var}(w_j) \Big].
\end{aligned}
\]

Substitute in the result from (2) and (3) and combine the first and second term:

[combined expression for Var(t) not recoverable from the scan]

where $\phi_{ij} = \phi^{*}_{ij}(1)$ for any i and j.

CONCLUSIONS

The model for a multicenter job-shop with random arrival and random service rate has been presented. This model differs from that of others in that none of the authors mentioned in the introduction have dealt with the waiting-time distribution of a job through the job-shop or followed a tagged job through its completion.

REFERENCES

[1] Baker, K. R., Introduction to Sequencing and Scheduling (John Wiley, 1974).
[2] Burke, P. J., "The Output of a Queuing System," Operations Research, 4, 699-704 (1956).
[3] Conway, R. W., Maxwell, W. L., and Miller, L. W., Theory of Scheduling (Addison Wesley, 1967).
[4] Howard, R. A., Dynamic Programming and Markov Processes (MIT Press, 1960).
[5] Jackson, J. R., "Network of Waiting Lines," Operations Research, 5, (4) (1957).
[6] Jackson, J.
R., "Some Problems in Queuing with Dynamic Priorities," Naval Research Logistics Quarterly, 7, (3) (1960).
[7] Jackson, J. R., "Job Shop-Like Queuing Systems," Management Science Research Report, Research Report No. 81, UCLA (January 1963).
[8] Reinitz, R., "On the Job-shop Scheduling Problem," in Industrial Scheduling, J. Muth and G. L. Thompson (Eds.) (Prentice Hall, Inc., Englewood Cliffs, N.J., 1963).

A MATHEMATICAL PROGRAMMING APPROACH TO THE SCHEDULING OF SORTING OPERATIONS

Frederic H. Murphy
Department of Energy
Washington, D.C.

Edward A. Stohr
Northwestern University
Evanston, Illinois

ABSTRACT

In this paper we describe an approach to the scheduling and/or real-time control of sorting operations in the presence of deadlines. The problem arises in the postal service where mail has to be sorted by zip codes, and in the banking system where checks have to be sorted according to the bank on which they are drawn. In both applications losses are incurred if items miss their clearing deadlines. For example, in check-sorting an extremely important objective of the control system is to reduce the "float", i.e., the total dollar value of the checks which miss their deadlines. The proposed real-time control system utilizes a linear program which chooses between alternative sort-patterns and assigns the various processing steps to the time periods between deadlines.

1. INTRODUCTION

In this paper we are concerned with the design of optimal control systems for the sorting of documents by computer-controlled sorting machines. The problem has great economic importance since it occurs both in the postal service where mail has to be sorted by zip-code, and in the banking system where checks have to be sorted by the bank in which they are deposited for return to the banks on which they are drawn.
A discussion of the mail-sorting problem is given by Horn [3], while a good description of an actual computer system for real-time control of check-processing operations is given by Banham and McClelland [2].

In both the postal and banking applications, computer-controlled reader-sorters are employed which read the documents by the use of either optical character recognition techniques or magnetic ink character recognition techniques. The documents are then directed by the machine to a particular pocket or hopper based on their identification code. Since the number of final destinations for the documents far exceeds the number of pockets available on the sorter, many documents must be passed through the sorter more than once.

Batches of documents arrive at random times through the day. The sorters group them according to their endpoint destinations. In the postal application, the endpoints are associated with zip code regions; in the banking application, the endpoints may be a collection of banks within a region, a Federal Reserve Bank, or a single bank which must be sent a large volume of checks. The sorting process is subject to a number of clearing deadlines, and the performance of the system is closely related to the number and/or value of the documents which miss their deadlines on each day. For example, in check-sorting applications an important objective of the processing system is to minimize the total dollar value of checks which miss their deadlines, since one day's interest will be lost on these checks.

We now present some terminology and review the relevant literature. During processing, the items are read into a computer-controlled sorter. At each pass, a code which translates to the endpoint is read off each item and the item is sent to a specified pocket. Sorters are available with different numbers of pockets.
However, since the number of endpoints is substantially larger than the number of pockets, many items must go through multiple passes. This means that on early passes many endpoints have to be grouped into the same pocket, then broken down into subgroups and finally into individual endpoints. Since some endpoints have substantially higher volumes of items than others, it is clear that these endpoints ought to be separated before the low-volume endpoints.

For a given batch of items (documents) containing n endpoints and a sorter with m pockets, the sorting process can be represented by a tree. For example, if m = 3 and the number of endpoints n = 13, the tree may look like either of the trees in Figure 1 (the meaning of the letters shown above the leaves of the trees will be explained later).

[FIGURE 1: sort-trees (a) and (b)]

All nodes connected to a single arc are called "exterior" nodes [4] and represent distinct endpoints for checks. All other nodes (the "interior" nodes) represent "rehandle" pockets, i.e., a pass of a subgroup of checks through the sorter. Since we can always add endpoints of zero volume, we need only consider m-ary trees, that is, trees where exactly m arcs emanate from each interior node. The number of zero-volume endpoints which must be added is given by the following lemma:

LEMMA 1: Let $i = n \bmod (m-1)$ and let $j = i$ if $i \ge 2$, $j = m-1$ if $i = 0$, and $j = m$ if $i = 1$. Then the number of zero-volume endpoints to be added to make the m-ary tree is given by $m - j$.

For a proof of Lemma 1 see Knuth ([4], p. 590). From now on we assume that the $m - j$ zero-volume endpoints are added to ensure an m-ary tree.

Let $v_i$ equal the expected volume of items for endpoint i and let $w_i$ equal their expected value for i = 1, 2, ..., n. We say that a tree is an h-level tree if the maximum number of arcs from the root to an endpoint is h. A node is at level h if it is the hth node on the path from the root to this node. Note that an endpoint at level h goes through h-1 passes.
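The padding rule of Lemma 1 can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the sketch assumes at least three pockets, and the interior-node count is obtained by counting arcs (each interior node has m outgoing arcs, and a tree with L leaves and N interior nodes has L + N - 1 arcs).

```python
def zero_volume_padding(n: int, m: int) -> int:
    """Zero-volume endpoints to add so that n endpoints fill an m-ary tree (Lemma 1)."""
    if m < 3 or n < 1:
        # Assumption of this sketch: m >= 3 pockets and at least one endpoint.
        raise ValueError("sketch assumes m >= 3 and n >= 1")
    i = n % (m - 1)
    if i == 0:
        j = m - 1
    elif i == 1:
        j = m
    else:
        j = i
    return m - j


def interior_node_count(n: int, m: int) -> int:
    """Interior (rehandle) nodes of the padded m-ary tree."""
    leaves = n + zero_volume_padding(n, m)
    # m * N = leaves + N - 1, hence N = (leaves - 1) / (m - 1).
    return (leaves - 1) // (m - 1)


if __name__ == "__main__":
    # The example in the text: m = 3 pockets, n = 13 endpoints.
    print(zero_volume_padding(13, 3), interior_node_count(13, 3))  # 0 6
```

For m = 3 and n = 13 no padding is needed and the tree has six interior nodes, so every sort-pattern for such a batch requires six passes.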
For a given batch, the processing procedure is completely specified by a sort-tree as given above together with the sequence of passes or visits to interior nodes. An integer programming formulation of the problem of selecting a sort-tree (sort-pattern) to minimize the average number of passes per item is presented in Singh [8]. However, the model becomes computationally intractable for practical problems, since the number of integer variables increases very rapidly with the number of endpoints and the maximum number of levels specified in the tree. Nevertheless, the formulation provides helpful insight.

In a previous paper [6], the authors have provided an algorithm which determines the class of sort-trees with minimum total processing time for a given batch. This algorithm works when the cost per pass of processing an item is positive. A restriction on the allowable number of levels in the tree can be included if required. The output of the algorithm is a tree of "Type 1" as shown in Figure 1(a), where the endpoints are ordered by increasing item volume from left to right, and the internal nodes are grouped to the right on each level.

Once a tree is chosen, a procedure to find the sequence of visits to the interior nodes which minimizes the average weighted time to visit each endpoint is given by Horn [3]. However, this objective function does not handle the case where there are specified deadlines for each endpoint. The sequencing of sorting operations on one machine is also discussed by Moore [5], who used simulation techniques to test standard scheduling procedures such as shortest operating time in a banking application.

Here we assume that each endpoint has a single deadline. Let $T_t$ be the time interval between the (t-1)st and the tth deadline. Define $B_{it}$ to be the value of endpoint i if it is processed during the tth interval.
For example, in a check-sorting application this value is a function of the interest rate, of the expected total value of checks for endpoint i within the batch, and of whether the processing for endpoint i is completed before or after its deadline. In a postal application the value of the items for endpoint i may be defined as a weighted sum of the expected number of 1st class, 2nd class, and 3rd class mail items associated with the ith zip code area. Our objective throughout the paper is to maximize the total expected benefits.

In Section 2 we formulate the processing problem in terms of an integer program which extends the work of Singh [8] to allow for multiple batches and time sequencing of operations in the presence of deadlines. Although the formulation provides interesting insights into the combinatorial nature of the problem, it is computationally difficult. We therefore present a somewhat different linear programming approach to the problem in Sections 3 and 4. This approach can be easily implemented and can be used either to produce preplanned schedules or to provide real-time control. Finally, Section 5 provides an example illustrating the proposed procedures.

2. INTEGER PROGRAMMING FORMULATION

The environment for our model is as follows. A number of batches of items arrive throughout the day. The arrival times and composition of the batches can be predicted with reasonable accuracy. The batches are to be processed using one or more sorters. For expositional purposes the sorters are assumed to have m pockets; however, the model can be generalized to allow for a multiplicity of sorter sizes. If there is more than one sorter, the definition of $T_t$ is modified so that it equals the total available sorter time between deadlines t-1 and t. The time to process a batch of items through one pass is composed of a fixed setup time c plus a variable time $\lambda$ per item.
For each batch b, let $B^b_{it}$ denote the benefit associated with isolating endpoint i in period t and let $q^b_i$ denote the expected volume of items for endpoint i.

For expositional purposes we will assume that we wish to control the processing in a real-time mode, i.e., we monitor the system continuously and adjust our processing strategy accordingly. A pass of a batch through a sorter creates a set of completely separated endpoints and/or a new set of subbatches to be sorted subsequently. At any time during the day there is a set of batches waiting for processing to begin and another collection of partially sorted batches. No distinction need be made between these two types of batches since the sorting pattern for a subbatch can also be represented by an m-ary tree. After each pass of a batch through a sorter, a decision has to be made concerning the next batch to be processed and the sorting pattern to be used for the chosen batch.

This problem is modeled by the following integer program. The solution of the integer program specifies the optimal sort pattern for each batch together with the time interval between deadlines in which each pass associated with an interior node is made. Processing may then commence on any batch assigned to the first time interval.

We define the following indices:

$d_h$ = dth interior node at level h
$e_{h-1}$ = eth interior node at level h-1
$j_{h-2}$ = jth interior node at level h-2.
The variables for each batch b are:

$x^b_{i e_{h-1}}$ = 1 if the ith endpoint is assigned to the eth interior node at level h-1; 0 otherwise
$x^b_{i e_{h-1} t}$ = 1 if $x^b_{i e_{h-1}} = 1$ and the ith endpoint is scheduled to be isolated in the time period between the (t-1)st and tth deadline; 0 otherwise
$y^b_{d_h e_{h-1}}$ = 1 if the dth interior node at level h is assigned to the eth interior node at level h-1; 0 otherwise
$y^b_{d_h e_{h-1} t}$ = 1 if $y^b_{d_h e_{h-1}} = 1$ and the pass associated with the dth interior node at level h is scheduled for processing in the time period between the (t-1)st and tth deadline; 0 otherwise
$z^b_{e_{h-1}}$ = maximum number of endpoints associated with the eth interior node at level h-1
$\sigma^b_{d_h e_{h-1}}$ = the total processing time for the batch associated with the interior node defined by $y^b_{d_h e_{h-1}}$

The objective is to maximize the total benefits received:

(1) $\max \sum_b \sum_i \sum_{e_{h-1}} \sum_t B^b_{it} \, x^b_{i e_{h-1} t}$

subject to the following constraints:

A time constraint associated with the period between each deadline:

(2) $\sum_b \sum_h \sum_{d_h} \sum_{e_{h-1}} \sigma^b_{d_h e_{h-1}} \, y^b_{d_h e_{h-1} t} \le T_t$

A requirement that each batch is processed in some time period:

(3) $\sum_t y^b_{d_h e_{h-1} t} = y^b_{d_h e_{h-1}}$

A precedence requirement on subbatches:

(4) [equation illegible in source]

A requirement that each endpoint is processed in some time period:

(5) $\sum_t x^b_{i e_{h-1} t} = x^b_{i e_{h-1}}$

Constraints which define the processing times for the non-endpoint nodes:

(6) [equation illegible in source]

A requirement that each endpoint be isolated:

(7) $\sum_h \sum_{e_{h-1}} x^b_{i e_{h-1}} = 1$

A requirement that the number of subbatches and endpoints isolated at each pass equals the number of pockets:

(8) [equation illegible in source]

A requirement that the dth interior node can be assigned to at most one non-endpoint node at the next lower level:

(9) $\sum_{e_{h-1}} y^b_{d_h e_{h-1}} \le 1$

A requirement that the number of endpoints isolated not exceed the number of pockets available after assignment of the subbatches:

(10) $z^b_{e_{h-1}} \ge \sum_i x^b_{i e_{h-1}}$

Finally, all variables are assumed to be nonnegative. Note first that constraints (2) and (6) are nonlinear.
To determine the sort pattern for even a single batch without considering the time of processing, assuming 500 endpoints and a maximum of four passes per endpoint, we would need 2,000 variables. With 20 time periods, the number grows to 40,000 variables for describing a single batch. As there are numerous batches processed in each day, this integer program is clearly impractical for either the planning or real-time control of the sorting system.

3. A LINEAR PROGRAMMING APPROACH TO SCHEDULING: PHASE 1

To obtain a practical scheduling system, we decompose the model described in the previous section into two phases. In the first, a set of candidate sort-patterns is generated for each batch according to the methods specified in [6] and described briefly below. Thus, in contrast to the formulation in Section 2, we no longer attempt to determine the optimal sort-patterns and the optimal processing sequence simultaneously. Although this simplifies the problem greatly, it would still be difficult to devise job-shop heuristics or rules which could adequately cope with the combinatorial nature of the problem and the kinds of tradeoffs involved. In the second phase the candidate sort-patterns are therefore incorporated in a linear program (described in Section 4) which chooses a sort-pattern for each batch from among those provided and assigns the processing of each interior node in the selected pattern to a time interval between deadlines.

Ideally, the program should be solved every time a sorter becomes available for another pass. However, to reduce the computational burden a good compromise would be to run the program periodically or at the time of arrival of each new batch. As a further possibility, the program could be run periodically (say once per month) and used to set up target processing schedules, perhaps one for each day of the week.
Thus, a wide range of implementation possibilities exists, from real-time control to preplanned schedules. The latter possibility can be readily implemented since the algorithm for generating the alternative trees during Phase 1 exists [6], Phase 2 involves only linear programming, and no changes in existing operational procedures would be required except for the substitution of the computed optimal sort-patterns and schedules. The implementation of some form of real-time control would be helpful because it would allow the system to react to unexpected volumes of documents and other unexpected events. However, some adjustment to the existing computer and manual procedures would be required.

We assume that it is possible to maintain a data base of the expected volume $q_i$ and value $v_i$ of the items associated with each endpoint i in each batch on any given day of the week. These quantities can be updated using techniques such as exponential smoothing. By the use of this data base, a collection of sort-patterns can be prepared periodically as described below. These patterns are in turn used to generate the data (concerning the processing times associated with each internal node, etc.) required by the program in Phase 2.

A sort-pattern for a batch or subbatch is represented by an m-ary tree, and the linear program can be used with any arbitrary collection of sort-patterns. Thus, as is the case in most document-sorting operations, only two or three patterns may be used, one for each major classification of batch types or division of the day (e.g., morning, afternoon, and evening). This restriction of the number of sort-patterns reduces the complexity of tracing the subbatches through the various processing stages and has been found useful in manual control systems. A greater diversity of sorting patterns is possible if the tracing operations are under computer control with instructions to the operators displayed on CRT devices.
However, for computational reasons, it is still desirable to restrict the number of sort-patterns considered. One possibility is to use only sort-trees of "Type 2", which retain the property of minimum average processing time and which also have the property that the number of set-ups required to completely process all the checks for the most imminent deadline is minimized. Such a tree is shown in Figure 1(b), where the letters above each endpoint refer to the associated deadline, with deadline "a" being the most urgent, deadline "b" the next most urgent, and so on.

A Type-2 tree is obtained from a Type-1 tree as follows: Order the endpoints at each level in the Type-1 tree from left to right by increasing time to deadline and, within the group of endpoints for each deadline, by decreasing value. Next, associate each internal node with the highest priority deadline in the subtree for which it is a root and move it to the left until an endpoint with the same or higher priority deadline is encountered. Note that the endpoints associated with the different deadlines tend to be grouped together. Similarly, endpoints with high expected dollar value tend to be grouped together. Also note that the number of possible Type-2 trees for each batch equals the number of deadlines associated with its endpoints.

To summarize, the procedure in Phase 1 is to use data concerning the expected volume of items for each endpoint to generate a minimum total processing time (Type 1) tree for each anticipated batch or set of batches of incoming items. This is done using the algorithm given in [6]. The Type-1 tree for each batch is used to generate a number of Type-2 trees, one for each interval between time deadlines. This forms a basic set of alternative sort-patterns for each batch. To this basic set, management can add other likely candidates as required.

4.
LINEAR PROGRAMMING APPROACH TO SCHEDULING: PHASE 2

In this section we assume that the first phase of the procedure has been executed and that a suitable set of sort-patterns (m-ary trees) is available. We now present the integer linear programming model which is to be used in the second phase.

We define the following indices:

r = rth pattern for a given batch
s = sth interior node in a sort-pattern for a batch, where s = 1 denotes the root of the m-ary tree and the other interior nodes are enumerated from left to right on successive levels in the tree.

The data for the problem include the time intervals between deadlines, $T_t$, as before. In addition, the sort-patterns generated during the first phase provide the following information:

$B^b_{rst}$ = the benefit obtained from isolating the endpoints associated with the sth interior node of the rth pattern for batch b in the time interval between the (t-1)st and tth deadline
$\sigma^b_{rs}$ = the total processing time associated with the sth interior node of the rth pattern for batch b (this can be calculated by the use of an equation similar to (6))
$P(r, s)$ = the set of interior nodes that are directly connected to the sth interior node in the rth pattern for batch b.

We define the following variables:

$v^b_{rst}$ = 1 if interior node s of the rth pattern associated with batch b is processed in time period t; 0 otherwise.

As before, the objective is to maximize total benefits:

(11) $\max \sum_b \sum_r \sum_s \sum_t B^b_{rst} \, v^b_{rst}$

subject to the following constraints:

Time constraints:

(12) $\sum_b \sum_r \sum_s \sigma^b_{rs} \, v^b_{rst} \le T_t$ for all t

Constraints arising from the precedence relations in the sort-patterns:

(13) $\sum_{u \le t} v^b_{rsu} \ge \sum_{u \le t} v^b_{rs'u}$, $s' \in P(r, s)$, for all t, b

A requirement that all interior nodes be sorted and that a single pattern is chosen for each batch:

(14) $\sum_r \sum_t v^b_{rst} = 1$ for all s, b

The constraints (14) operate in the following manner. For s = 1 the constraints (13) and (14) select a single pattern for each batch from among those provided.
For each pattern selected we wish to ensure that every interior node is sorted. Let $\bar{r}$ be a selected pattern for batch b. From (13), $v^b_{rst} = 0$ for $r \ne \bar{r}$. This means that equations (14) are equivalent to

(15) $\sum_t v^b_{\bar{r}st} = 1$ for all s.

By the following lemma these constraints guarantee that each endpoint is isolated.

LEMMA 2: For a given number of endpoints, all sort-patterns contain the same number of interior nodes.

PROOF: We have n external nodes and (say) N internal nodes. The number of arcs in the tree is then $n + N - 1$, but since every internal node has m outwardly directed arcs, we see that $mN = n + N - 1$ and

$N = (n - 1)/(m - 1)$,

which is constant.

Lemma 2 implies that there is the same number of constraints of the form (15) for each sorting pattern. Therefore the constraints (14) guarantee the completion of the sort-pattern for each batch.

It would be economically infeasible to solve this integer program on a repetitive basis. We therefore solve it as a linear program and provide decision rules, as discussed below, for dealing with any noninteger variables. To further reduce computation and storage requirements, we note that the constraints (14) are generalized upper bound constraints and that special computational techniques can be applied (see Agbadudu [1]). This is especially relevant if the program is to be used for real-time control. In this context note that at the time the linear program is run, a number of batches and partially processed subbatches will exist, and it will be necessary to include the relevant data for these batches. As in the paper by Reiter [7], it is also possible to anticipate future batch arrivals by including artificial batches b, with $B^b_{rst}$ equal to zero if the batch is not expected to arrive during the current period t.

Computational experience shows that there are relatively few noninteger variables in the final solution of the linear program. The noninteger variables which do appear come from two sources.
The first is where more than one Type-2 sort-pattern is chosen for a batch. In this case it is necessary to select one of the alternative patterns. A single pattern can be assigned to each batch b by choosing the pattern r with the maximum value of $\sum_t v^b_{r1t}$. After this process has been carried out for all batches b, the linear program can be rerun starting from the previous optimal solution and with $v^b_{r1t}$ set equal to zero for the eliminated patterns.

In the new optimal solution to the linear program, the only possible source of noninteger variables is the splitting of the processing of interior nodes into several time periods. Since it is unlikely that the processing times for the interior nodes will sum exactly to $T_t$, one node may, in any case, be split between time periods without loss. A process for generating a reasonable schedule for the processing of the interior nodes in the current period (t = 1), which also resolves any possible ambiguity arising from nonintegrality, is as follows:

(1) Initially consider only nodes with $v^b_{rs1} = 1$ in the optimal solution.
(2) These nodes define a set of subtrees. For each internal node in these subtrees, compute the difference in benefits between processing the node in the current period and in the succeeding period.
(3) Assign to each node the sum of the values obtained in step (2) for this node and all of its successor nodes (i.e., nodes in the subtree which cannot be processed before this node).
(4) For each node, divide the quantity calculated in step (3) by the total time required to process the node.
(5) Successively process the root nodes of the subtrees (and resulting subtrees, etc.), always selecting the root node with the highest value of the ratio computed in step (4) to be processed next.

By the use of this procedure, the nodes with the greatest benefit per unit time will be scheduled first.
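Steps (2) to (5) of the procedure can be sketched as a small greedy routine. This is an illustrative reading of the steps, not the authors' code: the function name, the dictionary-based tree representation, and the four-node example data are all hypothetical.

```python
def schedule_nodes(parent, time, gain):
    """Order interior nodes by steps (2)-(5) of the heuristic.

    parent[s] is the predecessor of node s (None for a subtree root),
    time[s] is the processing time of node s (assumed positive), and
    gain[s] is the step (2) value: the benefit lost if node s is moved
    from the current period to the succeeding one.
    """
    children = {s: [] for s in parent}
    for s, p in parent.items():
        if p is not None:
            children[p].append(s)

    def subtree_gain(s):
        # Step (3): the node's own difference plus that of all successors.
        return gain[s] + sum(subtree_gain(c) for c in children[s])

    # Step (4): benefit per unit of processing time.
    ratio = {s: subtree_gain(s) / time[s] for s in parent}

    # Step (5): repeatedly process the available root with the highest ratio.
    available = [s for s, p in parent.items() if p is None]
    order = []
    while available:
        best = max(available, key=lambda s: ratio[s])
        available.remove(best)
        order.append(best)
        available.extend(children[best])  # its successors become roots
    return order, ratio


# Hypothetical data: two subtrees, rooted at "A" and "D".
parent = {"A": None, "B": "A", "C": "A", "D": None}
time = {"A": 100, "B": 20, "C": 40, "D": 50}
gain = {"A": 0, "B": 30, "C": 20, "D": 20}
order, ratio = schedule_nodes(parent, time, gain)
print(order)
```

In this toy instance root "A" has no benefit of its own but is scheduled first because its successors are valuable, which is the effect step (3) is designed to capture.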
Scheduling the most beneficial nodes first is advantageous if there is a positive probability of a delay or breakdown in the processing system. After this process has been carried out, a certain amount of time will be left before the first deadline for processing the internal nodes with $0 < v^b_{rs1} < 1$ in the optimal solution. To compute a tentative schedule for these nodes, steps (2) to (5) are repeated except that the values computed in step (2) are multiplied by the associated values of $v^b_{rs1}$. An example of this process is given in the next section.

If the linear program is solved frequently, the losses due to this approximation procedure for dealing with noninteger solutions will be reduced. For example, if the procedure is used for real-time control and a new optimal solution is computed every time a sorter becomes available for another pass, it is only necessary to select the next (sub)batch to be processed.

5. ILLUSTRATIVE EXAMPLE

For purposes of illustration we assume a processing horizon with three deadlines indexed 1, 2, and 3. Processing is allowed to occur after the third deadline by replacing the equality constraints in (14) by less-than-or-equal-to constraints. Batches 1 and 2 are assumed to be ready for processing at the start of the first time interval. Batches 3 and 4 are anticipated to arrive at the beginning of the second time interval and batches 5 and 6 are anticipated to arrive at the beginning of the third time interval. For simplicity, the data for the batches are divided into two categories, with batches 1, 3, and 5 belonging to the first category and batches 2, 4, and 6 belonging to the second. The endpoint volumes and associated deadlines are shown in Table 1.

Table 1.
Endpoint Volumes and Deadlines

Endpoint                1    2    3    4    5    6    7    8    9   10   11   12   13
Category 1  $q_i$     200  100  100   50   20   20   20   10   10   10    1    1    1
            Deadline    3    1    2    3    1    1    1    1    2    2    3    3    3
Category 2  $q_i$     200  200  200  100  100  100   80   20   10   10    5    2    1
            Deadline    2    2    1    3    2    1    1    2    2    1    3    2    1

[FIGURE 2: Type-2 sort-trees. Panels: 1-1 (batch 1); 2-1 (batch 2); 1-2 (batches 1 and 3); 2-2 (batches 2 and 4); 1-3 (batches 3 and 5); 2-3 (batches 4 and 6)]

The alternative Type-2 sort-patterns for the batches are shown in Figure 2. Here the Type-2 sort-trees are identified by their category number and the time deadline given the highest priority during their construction by the method outlined in Section 3. Thus, sort-tree 1-2 is associated with category 1 endpoint data and time interval 2 (in this tree, the number of set-ups required to separate all the endpoints with the second deadline is minimized). In Figure 2 the internal nodes are numbered from left to right on successive levels. Two numbers are written above each endpoint node. The lower number is the endpoint index and the upper number is the associated deadline.

In the linear program the relevant sort-patterns for the first two time periods were specified as alternatives for batches 1 and 2, which arrive in the first time interval. For these batches the sort-pattern index r has values 1 and 2, corresponding respectively to the sort-patterns for the first two time periods. Similarly, the relevant sort-patterns for time periods two (r = 1) and three (r = 2) were specified as alternatives for batches 3 and 4, while only the sort-patterns for period three (r = 1) were specified as alternatives for batches 5 and 6.
To illustrate the indexing scheme, variable $v^4_{123}$ is associated with the second internal node of the first pattern for batch 4 (sort-tree 2-2). If $v^4_{123} = 1$ in the optimal solution to the linear program, then batch 4 is processed using the sort-pattern 2-2 and node 2 is isolated in period 3.

In the illustrative example, the benefit for separating endpoint i of batch b in period t equals $q^b_i$ if t is less than or equal to the deadline for endpoint i; it equals zero otherwise. Also, the average time to process an item through one pass is $\lambda = 1$, and the set-up time for a pass is c = 10. If we use this information, the values of $B^b_{rst}$ and $\sigma^b_{rs}$ are readily computed. For example, $B^4_{152} = 32$ and $\sigma^4_{15} = 42$ (see Table 1 and sort-tree 2-2 in Figure 2). Finally, the time intervals for deadlines 2 and 3 are assumed to be $T_2 = 3000$ and $T_3 = 3000$.

The linear program for this problem has 149 constraints and 132 variables. The optimal solutions for two different time intervals before the first deadline are shown in Table 2. The time constraints for periods 1 and 2 are tight while that for time period 3 is slack in the optimal solutions for both $T_1 = 2500$ and $T_1 = 2000$. As might be expected, the reduction in time available before the first deadline increases the scatter of the processing of the nodes from a given sort-tree amongst the various time intervals. It can be seen from Table 2 that the Type-2 sort-patterns associated with the second deadline were optimal for the period-one batches in three out of four cases. In two of these cases, however, the period-one sort-patterns represented alternative optima.

The grouping of endpoints by deadline in the Type-2 trees is usually advantageous. For example, the linear program used in the illustration was run 16 times with Type-1 sort-trees as alternative patterns for batches 1 and 2 (for values of $T_1$ = 1000, 1500, 2000, 2500 and with c = 10 and c = 100 for each value of $T_1$). Type-1 sort-trees were selected in preference to Type-2 sort-trees in only two of these cases.
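The two quoted values, a benefit of 32 and a processing time of 42 for node 5 of sort-tree 2-2, can be reproduced from Table 1 with a short sketch. The helper names are hypothetical, and the sketch assumes that node 5 of sort-tree 2-2 isolates endpoints 8, 9, and 12 (all with deadline 2); that reading of Figure 2 is an assumption, chosen because it matches the quoted figures.

```python
SETUP_TIME = 10     # c, set-up time per pass (from the example)
TIME_PER_ITEM = 1   # lambda, average time per item per pass (from the example)

# Category 2 rows of Table 1: endpoint -> (expected volume q_i, deadline).
CATEGORY_2 = {
    1: (200, 2), 2: (200, 2), 3: (200, 1), 4: (100, 3), 5: (100, 2),
    6: (100, 1), 7: (80, 1), 8: (20, 2), 9: (10, 2), 10: (10, 1),
    11: (5, 3), 12: (2, 2), 13: (1, 1),
}


def node_time(endpoints, table):
    """Set-up time plus per-item time for all items passed through the node."""
    return SETUP_TIME + TIME_PER_ITEM * sum(table[i][0] for i in endpoints)


def node_benefit(endpoints, period, table):
    """Volume isolated no later than its deadline; late endpoints earn zero."""
    return sum(q for q, deadline in (table[i] for i in endpoints)
               if period <= deadline)


# Assumption: node 5 of sort-tree 2-2 isolates endpoints 8, 9 and 12.
NODE_5 = [8, 9, 12]
sigma = node_time(NODE_5, CATEGORY_2)        # processing time of the node
benefit = node_benefit(NODE_5, 2, CATEGORY_2)  # benefit if processed in period 2
print(sigma, benefit)  # 42 32
```

Under the same assumption, processing the node in period 3 would earn no benefit, since all three endpoints would then have missed their second-period deadline.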
We now use the results of the linear program to compute a tentative processing schedule for time period one by the method described in Section 4. Since a unique pattern was chosen for the batches arriving during the first time interval, there is no need to adjust the solution in this respect. From Table 2 it can be seen that internal nodes 1 and 4 of sort-tree 2-2 in Figure 2 are the only nodes to be processed completely in period one according to the optimal solution of the linear program. From the precedence relationship, node 1 must be processed first, followed by node 4. The time taken for the two set-ups and to pass all the items in batch 2 through the first pass (node 1) and the items for endpoints 3, 6, and 7 in the second pass (node 4) is 1428 time units. Since $v^1_{2s1} = 0.69$ for s = 1, 3, 5, and 6, nodes 1, 3, 5, and 6 of sort-tree 1-2 in Figure 2 become candidates for scheduling during the remainder of period one. The calculations to determine the best sequence of visits to these nodes according to the method given in the last section are set out in Table 3.

Table 2. Solutions to the Linear Program (nonzero basic variables $v_{rst}$ and their values, by batch)

Batch 1
  $T_1 = 2500$: $v_{111}, v_{121}, v_{141}, v_{151}, v_{163}$ = 1; [variables illegible in source] = 0.15; [variables illegible in source] = 0.85
  $T_1 = 2000$: $v_{222}, v_{242}$ = 1; $v_{211}, v_{231}, v_{251}, v_{261}$ = 0.69; $v_{212}, v_{233}, v_{253}$ = 0.31
Batch 2
  $T_1 = 2500$: $v_{211}, v_{231}, v_{241}, v_{261}, v_{222}, v_{252}$ = 1
  $T_1 = 2000$: $v_{211}, v_{241}, v_{222}, v_{232}, v_{252}, v_{263}$ = 1
Batch 3
  $T_1 = 2500$: $v_{112}, v_{122}, v_{142}, v_{133}, v_{153}$ = 0.73; $v_{213}, v_{223}, v_{243}$ = 0.27
  $T_1 = 2000$: $v_{213}, v_{223}, v_{243}$ = 0.79; $v_{112}, v_{122}, v_{142}, v_{133}, v_{153}$ = 0.21
Batch 4
  $T_1 = 2500$: $v_{112}, v_{122}, v_{132}, v_{152}, v_{163}$ = 1
  $T_1 = 2000$: $v_{112}, v_{122}, v_{132}, v_{152}, v_{163}$ = 1
Batch 5
  $T_1 = 2500$: $v_{113}, v_{123}, v_{143}$ = 1
  $T_1 = 2000$: $v_{113}, v_{123}, v_{143}$ = 1
Batch 6
  $T_1 = 2500$: $v_{113}, v_{123}, v_{153}$ = 1
  $T_1 = 2000$: $v_{113}, v_{123}, v_{153}$ = 1

Table 3. Scheduling Batch 1 in Time-Interval 1

Node s   (1) $\sigma^1_{2s}$   (2) Difference in benefits   $v^1_{2s1} \times (2) \div (1)$
  1            553                     170                        .21
  3            182                     170                        .64
  5             32                      20                        .43
  6             60                      50                        .58

From the last column in Table 3, the tentative processing sequence for batch 1 is to visit internal nodes 1, 3, 6, and 5 of sort-pattern 2 in that order.
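The arithmetic behind Table 3 and the 1428 time units can be checked with the sketch below. The variable names are hypothetical. The ordering rule used here (root node 1 first, then the remaining nodes by decreasing ratio) is a simplifying assumption, since the precedence structure of sort-tree 1-2 is not reproduced in this text; it happens to give the sequence quoted above.

```python
V_FRACTION = 0.69  # LP value of v for nodes 1, 3, 5, 6 of sort-tree 1-2 in period one

# Table 3: node -> (processing time, difference in benefits).
TABLE_3 = {1: (553, 170), 3: (182, 170), 5: (32, 20), 6: (60, 50)}

# Last column of Table 3: weighted benefit difference per unit of time.
ratios = {s: V_FRACTION * diff / time for s, (time, diff) in TABLE_3.items()}

# Assumption: node 1 is the root and must precede the other nodes, which are
# then taken in order of decreasing ratio.
ROOT = 1
order = [ROOT] + sorted((s for s in ratios if s != ROOT),
                        key=ratios.get, reverse=True)

# Time used in period one by nodes 1 and 4 of sort-tree 2-2 (batch 2).
SETUP_TIME = 10
CATEGORY_2_VOLUMES = [200, 200, 200, 100, 100, 100, 80, 20, 10, 10, 5, 2, 1]
first_pass = SETUP_TIME + sum(CATEGORY_2_VOLUMES)   # all items of batch 2
second_pass = SETUP_TIME + 200 + 100 + 80           # endpoints 3, 6 and 7
used = first_pass + second_pass
remaining = 2000 - used                             # T_1 = 2000 case

print(order, used, remaining)
```

With 572 time units remaining, node 1 (553 units) fits in period one but node 3 (182 units) no longer does.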
However, only node 1 can be fully processed before the end of the first period, and for a real-time control application it would obviously be desirable to rerun the linear program with updated information before that time. Thus it may only be necessary to invoke the heuristic procedure for resolving ambiguities caused by nonintegral solutions when the time before the first deadline is short.

6. CONCLUSION

In this paper we have described a linear programming approach to the scheduling and/or real-time control of sorting operations in the presence of deadlines. The linear program chooses between alternative sort-patterns and assigns the passes involved to the time periods between deadlines. Although any sort-patterns may be used as data, we have suggested that candidate sort-patterns be chosen from the class with minimal total processing time and that allowance be made for the presence of deadlines by regrouping the internal nodes by the method described in Section 3.

Although we have described a straightforward and intuitively reasonable approach to the scheduling problem, it is apparent that there is room for experimentation with the choice of a suitable data base of sorting patterns, the determination of the times at which the linear program is run, and the rules for making use of the linear programming solution. These decisions might be tested and improved by incorporating the suggested algorithms for generating alternative sort-patterns, and the linear programming approach to the selection of the next batch to be processed, as components in a larger simulation program.

ACKNOWLEDGMENTS

The authors wish to thank Chuck Cooper of American National Bank for introducing us to the problem and Phillip Ryczek of Continental Bank for further assistance. The paper has also benefitted greatly from the comments of the Editor and referees.
REFERENCES

[1] Agbadudu, Amos, "Generalized Upper Bound, Variable Upper Bound and Extensions for Large Scale Systems," unpublished Ph.D. dissertation, Graduate School of Management, Northwestern University.
[2] Banham, J. A., and P. McClelland, "Design Features of a Real-Time Check Clearing System," IBM Systems Journal, No. 4 (1972).
[3] Horn, W. A., "Single-Machine Job Sequencing With Tree-like Precedence Ordering and Linear Delay Penalties," SIAM Journal of Applied Mathematics, Vol. 23, No. 2 (September 1972).
[4] Knuth, D. E., The Art of Computer Programming, Vol. 1 (Addison-Wesley, Reading, Massachusetts, 1969).
[5] Moore, L. J., "An Experimental Investigation of a Computerized Check Processing System in a Large City Bank Using Digital Simulation," Ph.D. Thesis, Arizona State University (September 1970).
[6] Murphy, F. H., and E. A. Stohr, "A Dynamic Programming Algorithm for Check Processing," Management Science (to appear).
[7] Reiter, S., "A System for Managing Job-Shop Production," University of Chicago, Journal of Business (July 1966).
[8] Singh, B. J., "A Heuristic Approach to Solve a Large Scale Linear Programming Problem," presented at the ORSA-TIMS Conference (Fall 1974).

DEVELOPING AN OPTIMAL REPAIR-REPLACEMENT STRATEGY FOR PALLETS

Christoph Haehling von Lanzenauer and Don D. Wright
School of Business Administration, The University of Western Ontario
London, Canada

ABSTRACT

A Markovian model is presented for the development of an optimal repair-replacement policy for the situation requiring a decision only at failure. The problem is characterized by the presence of growth which is integrated into the formulation. The model is applied to an actual problem, with data analysis and results given. Substantial savings are indicated.

I. INTRODUCTION

The problem of determining when to repair and when to replace failing equipment is a concern of management of productive resources.
Inefficient management due to the use of nonoptimal repair-replacement policies can have significant financial implications. The purpose of this paper is to describe analysis and results of a study determining an optimal repair-replacement strategy for wooden pallets under conditions of growth.

At the time of the analysis a damaged pallet was not repaired but disposed of (sold to scrap dealers for $1.50 if saleable) and replaced by a new pallet costing $8.50. The policy was adopted on grounds ". . . that repaired pallets don't provide the same efficiency and are therefore less economical ..." With the steadily rising prices for new pallets (the price had almost doubled since 1970) and a saturation on the market for used pallets, a review of the existing policy was called for. The problem was therefore to determine an optimal repair-replacement policy by specifying under what conditions a damaged pallet should be repaired at an average cost of about $2.50 or be replaced by a new one.

Damages to pallets occur due to some process which is a function of misuse, age, and history of repairs. As misuse is random, the parameters age and history of repairs can be considered as defining the state of a pallet. Thus, its life can be represented as a Markov chain. Various authors [1, 2, 4, 5, 6, 7] have used Markovian analysis to develop optimal inspection, preventative maintenance, and replacement strategies. These models are not directly applicable to the problem under consideration. A decision is only required after failure, making the inspection and preventative maintenance aspects irrelevant. Furthermore, the growth aspect in the problem must be explicitly accounted for.

II. THE MODEL

Various types of repair-replacement policies can be considered and include
(a) Repair a pallet only if its age is less than k (k = 0, 1, 2, ...) periods
(b) Repair a pallet only s (s = 0, 1, 2, ...)
times during its service life
(c) Repair a pallet provided its age is less than k periods and the number of previous repairs is less than s.

Of course, a damaged pallet may or may not be repairable. The above policies apply only to pallets which can be repaired.

1. Criterion

In the development of an optimal repair-replacement policy, a criterion has to be adopted. The cash flows associated with any policy will arise from new purchases, repairs, and sales of scrap pallets. Since a policy will be applied on an ongoing basis, the expected cost per pallet per period at steady state is selected as decision criterion. The impact of the steady state assumption is examined by investigating the transitional behavior. In choosing from different policies we can express the criterion formally by

\[ E(C) = \min_{k,\,s} \left\{ 8.5\,X(k,s) + 2.5\,Y(k,s) - 1.5\,Z(k,s) \right\} \tag{1} \]

with
X(k, s): probability of a pallet being new
Y(k, s): probability of a pallet being repaired
Z(k, s): probability of a damaged pallet being sold as scrap
when a policy with parameters k and s is used. In order to evaluate different policies, X(k, s), Y(k, s), and Z(k, s) must be determined. This is possible by modeling the stochastic behavior of a pallet.

2. The Stochastic Process With Growth

For the policies to be considered, the age and the number of repairs of a pallet are the key variables. Let j (j = 0, 1, ..., J) and r (r = 0, 1, ..., s) represent the age and the number of repairs, respectively. The indices j and r define the state of a pallet. Let \( \pi^{ks}_{jr}(t) \) be the probability that a randomly selected pallet is in state jr at the beginning of period t (t = 1, 2, ...) when a policy with parameters k and s is used. The length of period t is selected such that a pallet can be damaged only once. Let \( p^{ks}_{iq,jr}(t) \) with i = 0, 1, ..., J and q = 0, 1, ..., s be the probability that a pallet in state iq at the beginning of period t will be in state jr at the beginning of period t+1 if a policy with parameters k and s is used.
Since there is no reason to assume that the process by which pallets are damaged changes from period to period, we let the transition probability be independent of t. Thus, \( p^{ks}_{iq,jr}(t) = p^{ks}_{iq,jr} \). For the policies to be considered, the transition probabilities are defined as

\[ p^{ks}_{iq,jr} = \begin{cases} P_{iq}\,(1-\beta_{iq}) & \text{if } j=0 \text{ and } r=0 \\ P_{iq}\,\beta_{iq} & \text{if } j=i+1 \text{ and } r=q+1 \\ 1-P_{iq} & \text{if } j=i+1 \text{ and } r=q \\ 0 & \text{otherwise} \end{cases} \tag{2} \]

for i = 0, 1, ..., k-1 and q = 0, 1, ..., s-1,

\[ p^{ks}_{iq,jr} = \begin{cases} P_{iq} & \text{if } j=0 \text{ and } r=0 \\ 1-P_{iq} & \text{if } j=i+1 \text{ and } r=q \\ 0 & \text{otherwise} \end{cases} \tag{3} \]

for i = k, ..., J-1 and q = 0, 1, ..., s and for i = 0, 1, ..., k-1 and q = s, and

\[ p^{ks}_{iq,jr} = \begin{cases} 1 & \text{if } j=0 \text{ and } r=0 \\ 0 & \text{otherwise} \end{cases} \tag{4} \]

for i = J and q = 0, 1, ..., s, with \( P_{iq} \) representing the probability that a pallet in state iq will be damaged and \( \beta_{iq} \) (\( 0 \le \beta_{iq} \le 1 \)) being the fraction of damaged pallets in state iq that can be repaired. Note that the damage probability of a pallet with q+1 repairs may be greater than, equal to, or less than the damage probabilities of a pallet with q repairs.

In order to allow for growth, the number of pallets in the system at the beginning of period t, N(t), can be expressed by

\[ N(t) = (1+g)\,N(t-1) \tag{5} \]

with g being the growth rate. (5) can be rearranged as

\[ \frac{N(t-1)}{N(t)} = \frac{1}{1+g} \tag{6} \]

Therefore, \( \pi^{ks}_{jr}(t)\,N(t) \) is the number of pallets in state jr at the beginning of period t when a policy with parameters k and s is used. Since growth materializes through the purchase of new pallets (i.e. j = 0 and r = 0), \( \pi^{ks}_{jr}(t)\,N(t) \) can be expressed by (7) for j = 0 and r = 0

\[ \pi^{ks}_{00}(t)\,N(t) = \sum_{i}\sum_{q} \pi^{ks}_{iq}(t-1)\,N(t-1)\,p^{ks}_{iq,00} + g\,N(t-1) \tag{7} \]

and by (8) for j > 0 and r ≥ 0

\[ \pi^{ks}_{jr}(t)\,N(t) = \sum_{i}\sum_{q} \pi^{ks}_{iq}(t-1)\,N(t-1)\,p^{ks}_{iq,jr} \tag{8} \]

Since all states form a single ergodic class, the steady state probabilities exist. Thus \( \pi^{ks}_{jr}(t) = \pi^{ks}_{jr} \). Dividing (7) and (8) by N(t) and substituting by (6) leads to

\[ \pi^{ks}_{00} = \frac{1}{1+g} \sum_{i}\sum_{q} \pi^{ks}_{iq}\,p^{ks}_{iq,00} + \frac{g}{1+g} \quad \text{for } j=0,\ r=0 \tag{9} \]

and

\[ \pi^{ks}_{jr} = \frac{1}{1+g} \sum_{i}\sum_{q} \pi^{ks}_{iq}\,p^{ks}_{iq,jr} \quad \text{for } j>0 \text{ and } r \ge 0 \tag{10} \]

which in conjunction with

\[ \sum_{j}\sum_{r} \pi^{ks}_{jr} = 1 \tag{11} \]

can be used to determine the steady state probabilities.
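The steady state system (9), (10), and (11) is easy to solve numerically once the state is reduced to age alone. The sketch below assumes that damage probabilities depend only on age, so that the repair index r can be dropped and only the critical age k matters; this is the simplification the application section adopts. The damage probabilities `P` used here are hypothetical placeholders, not the fitted values of the paper, and the function names are illustrative.

```python
import math


def transition_probs(P, beta, k):
    """Return (to_new, to_next) for ages i = 0..J under an age-only policy.

    to_new[i]  : probability that a pallet of age i is replaced (moves to age 0)
    to_next[i] : probability that it survives, possibly repaired, to age i + 1
    """
    J = len(P) - 1
    to_new, to_next = [], []
    for i in range(J + 1):
        if i == J:                       # oldest pallets always leave the system
            new = 1.0
        elif i < k:                      # repairable damage is repaired
            new = P[i] * (1.0 - beta[i])
        else:                            # any damage leads to replacement
            new = P[i]
        to_new.append(new)
        to_next.append(1.0 - new)
    return to_new, to_next


def steady_state(P, beta, k, g):
    """Solve (10) recursively and normalise with (11); (9) then holds as well."""
    _, to_next = transition_probs(P, beta, k)
    weights = [1.0]
    for j in range(1, len(P)):
        # equation (10): pi_j = pi_{j-1} * p_{j-1,j} / (1 + g)
        weights.append(weights[-1] * to_next[j - 1] / (1.0 + g))
    total = sum(weights)
    return [w / total for w in weights]


J = 23                                                            # as in the application
P = [min(1.0, 0.04 * math.exp(0.08 * i)) for i in range(J + 1)]   # hypothetical values
beta = [0.85] * (J + 1)                                           # repairable fraction
pi = steady_state(P, beta, k=12, g=0.04)

# Check equation (9) for the state of new pallets.
to_new, _ = transition_probs(P, beta, 12)
rhs = sum(p * a for p, a in zip(pi, to_new)) / 1.04 + 0.04 / 1.04
print(abs(pi[0] - rhs) < 1e-12)
```

Because each age can only be reached from the preceding age, the chain has a product-form solution and no matrix inversion is needed; with both indices j and r retained, a general linear solver would be the more natural choice.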
Of course, for g = 0, the model (9), (10), and (11) reduces to the standard Markov chain.

The probabilities X(k, s), Y(k, s), and Z(k, s) required in the evaluation of different policies according to (1) can be determined using the stochastic process \( \pi^{ks}_{jr} \), and are expressed by

\[ X(k,s) = \sum_{j=0}^{J}\sum_{r=0}^{s} \pi^{ks}_{jr}\,P_{jr} - Y(k,s) \tag{12} \]

\[ Y(k,s) = \sum_{j=0}^{k-1}\sum_{r=0}^{s-1} \pi^{ks}_{jr}\,P_{jr}\,\beta_{jr} \tag{13} \]

\[ Z(k,s) = \sum_{j=0}^{J}\sum_{r=0}^{s} \pi^{ks}_{jr}\,P_{jr}\left[(1-\beta_{jr})\,\alpha_{jr} + \beta_{jr}\right] - Y(k,s) \tag{14} \]

with \( \alpha_{jr} \) (\( 0 \le \alpha_{jr} \le 1 \)) being the fraction of unrepairable pallets that can still be sold as scrap.

III. APPLICATION

The model has been applied for the development of an optimal repair-replacement policy of wooden pallets. For a number of primarily administrative reasons only the policy of repairing a pallet if its age is less than k periods was considered. Although the model can handle any assumption, it is assumed that \( P_{iq} = P_{i,q+1} \). The assumption is made as no statistical information is available (the current policy is a no-repair policy).*

1. Data

To apply the model the damage probability \( P_i \) must be determined. (Note that the index q is not needed for the type of policy considered.) \( P_i \) was obtained by two separate random samples, one of undamaged and one of damaged pallets (Table 1). In light of the information available from the samples, very few (if any) of the pallets in the system were purchased prior to 1968. Thus, J = 23. Furthermore, the length for period t was selected as a quarter, as pallets are identified on a quarterly basis. The impact of this assumption is investigated in the sensitivity analysis. Assuming that the samples are representative, \( P_i \) can be estimated by

(12) P_i = [expression illegible in source]

with M being the number of pallets damaged per quarter and N the total number of pallets in the system, estimated by management at 150,000. M can be estimated from the information of Table 2. The differences between purchases and scrappages suggest that the number of pallets in the system has been increasing.
The growth per quarter can be estimated by the difference in the average number purchased and the average number sold as scrap, i.e. 16,756 - 9,549 ≈ 7,000. Thus g ≈ 4% and M = 10,000. Management, however, felt that the number of pallets was constant and the discrepancy could be attributed to reasons such as pallets being lost in the system, pallets being lent outside the system, only a fraction of damaged pallets being sold for scrap, or a combination of these. This implies g = 0 and M = 17,000. The results are therefore given for both situations.

*This assumption is supported by discussions with operating people indicating that repaired pallets, if anything, tend to be somewhat stronger than nonrepaired pallets of comparable age.

Table 1. Pallet Sample Results

Date of manufacture    Fraction of undamaged    Fraction of damaged
Year    Quarter        pallets (m_i)            pallets (u_i)
73      4              0.1847                   0.1746
73      3              0.0944                   0.0238
73      2              0.0792                   0.0106
73      1              0.0597                   0.0212
72      4              0.0778                   0.1111
72      3              0.1000                   0.0423
72      2              0.0556                   0.0608
72      1              0.0472                   0.0529
71      4              0.0597                   0.0714
71      3              0.0417                   0.0556
71      2              0.0208                   0.0741
71      1              0.0042                   0.0370
70      4              0.0347                   0.0529
70      3              0.0250                   0.0291
70      2              0.0264                   0.0265
70      1              0.0250                   0.0159
69      4              0.0111                   0.0397
69      3              0.0111                   0.0132
69      2              0.0069                   0.0238
69      1              0.0042                   0.0159
68      4              0.0208                   0.0238
68      3              0.0069                   0.0106
68      2              0.0028                   0.0132
Total                  1.0000                   1.0000
Sample size            720                      378

Table 2. Pallet Purchases and Scrappages

Period                       Purchased    Sold as Scrap
June-August 71               19,840       7,550
September-November 71        13,220       9,580
December 71-February 72      9,620        8,370
March-May 72                 29,230       11,510
June-August 72               24,245       10,260
September-November 72        6,560        9,080
December 72-February 73      11,190       7,900
March-May 73                 14,930       10,040
June-August 73               29,350       13,220
September-November 73        11,505       8,780
December 73-February 74      14,640       8,750
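The averages behind the growth estimate can be recomputed directly from Table 2. The following is a small sketch; the figures are copied from the table, the pool size of 150,000 is management's estimate quoted in the text, and the variable names are illustrative. The text works with the rounded values 7,000 for the quarterly growth, 10,000 and 17,000 for M, and g = 4%.

```python
# Quarterly purchases and scrappages, copied from Table 2 (11 periods).
purchased = [19840, 13220, 9620, 29230, 24245, 6560,
             11190, 14930, 29350, 11505, 14640]
scrapped = [7550, 9580, 8370, 11510, 10260, 9080,
            7900, 10040, 13220, 8780, 8750]

POOL_SIZE = 150_000  # management's estimate of the number of pallets

avg_purchased = sum(purchased) / len(purchased)
avg_scrapped = sum(scrapped) / len(scrapped)
growth_per_quarter = avg_purchased - avg_scrapped
growth_rate = growth_per_quarter / POOL_SIZE  # unrounded; the text adopts g = 4%

print(f"average purchased per quarter: {avg_purchased:,.0f}")
print(f"average sold as scrap:         {avg_scrapped:,.0f}")
print(f"growth per quarter:            {growth_per_quarter:,.0f}")
print(f"implied growth rate:           {growth_rate:.1%}")
```

Under the growth interpretation the number damaged per quarter is taken close to the average scrapped; under the no-growth interpretation it is taken close to the average purchased, which is where the two values of M come from.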
An exponential curve provided a reasonable fit to the values of P_i according to (12), resulting in:*

(13) P_i = 0.03694 e^{[coefficient illegible in source] i} for g = 4% and M = 10,000

and

(14) P_i = 0.04776 e^{[coefficient illegible in source] i} for g = 0% and M = 17,000.

Based on discussions with operating people, about 15% of the damaged pallets were unrepairable. Thus, β_i = 0.85 for all i. According to existing market conditions, unrepairable pallets had no value, implying α_i = 0 for all i.

*Of course, the results could be carried out for P_i according to (12). However, the authors feel that the fitted values for P_i better represent the true underlying process.

2. Results

Table 3 represents the results for both the growth and no-growth situation.

Table 3. Expected Cost per Pallet per Quarter

Critical Age k    Growth       No Growth
0                 $0.536107    $0.749660
1                 0.520751     0.731418
2                 0.506711     0.714818
3                 0.493914     0.699835
4                 0.482293     0.686449
5                 0.471789     0.674639
6                 0.462348     0.664384
7                 0.453923     0.655668
8                 0.446471     0.648476
9                 0.439955     0.642791
10                0.434343     0.638602
11                0.429607     0.635898
12                0.425726     0.634673 (minimum)
13                0.422683     0.634921
14                0.420467     0.636639
15                0.419080     0.639829
16                0.418529 (minimum)    0.644500
17                0.418839     0.650674
18                0.420056     0.658394
19                0.422254     0.667746
20                0.425554     0.678899
21                0.430149     0.69219
22                0.436342     0.708321
23                0.444625     0.728831

The minimum cost policy with growth is to repair a pallet if its age is less than 16 quarters. Relative to the no-repair policy, expected annual savings are in the order of ($0.536107 - $0.418529) · 4 · 150,000 · (1+g)^t = $70,546.80 · (1+g)^t. The minimum cost policy with no growth is to repair pallets if their age when being damaged is less than 12 quarters. Relative to the existing no-repair policy, annual savings in the order of ($0.749660 - $0.634673) · 4 · 150,000 = $68,992.20 can be expected.
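The choice of critical age and the savings arithmetic can be checked mechanically from Table 3. The sketch below copies the tabulated costs (the no-growth entry for k = 21 is reproduced as it appears in the source), uses the pool size of 150,000 quoted in the text, and leaves out the growth factor (1+g)^t, so the growth figure refers to the initial pool only. The function names are illustrative.

```python
# Expected cost per pallet per quarter from Table 3; list index = critical age k.
GROWTH = [0.536107, 0.520751, 0.506711, 0.493914, 0.482293, 0.471789,
          0.462348, 0.453923, 0.446471, 0.439955, 0.434343, 0.429607,
          0.425726, 0.422683, 0.420467, 0.419080, 0.418529, 0.418839,
          0.420056, 0.422254, 0.425554, 0.430149, 0.436342, 0.444625]
NO_GROWTH = [0.749660, 0.731418, 0.714818, 0.699835, 0.686449, 0.674639,
             0.664384, 0.655668, 0.648476, 0.642791, 0.638602, 0.635898,
             0.634673, 0.634921, 0.636639, 0.639829, 0.644500, 0.650674,
             0.658394, 0.667746, 0.678899, 0.69219, 0.708321, 0.728831]

PALLETS = 150_000
QUARTERS_PER_YEAR = 4


def best_critical_age(costs):
    """Critical age k with the lowest expected cost per pallet per quarter."""
    return min(range(len(costs)), key=costs.__getitem__)


def annual_savings(costs, k):
    """Annual savings of policy k relative to the no-repair policy (k = 0)."""
    return (costs[0] - costs[k]) * QUARTERS_PER_YEAR * PALLETS


for label, costs in (("growth", GROWTH), ("no growth", NO_GROWTH)):
    k = best_critical_age(costs)
    print(f"{label}: k = {k}, annual savings = ${annual_savings(costs, k):,.2f}")
```

Both cost columns are unimodal in k, so a simple scan for the minimum is sufficient here.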
Based on the above results, a pallet should be repaired if its age is less than 16 quarters for the growth assumption and 12 quarters for the no-growth assumption. Since it is not certain whether the growth or no-growth condition actually exists, implementing the wrong decision will result in opportunity losses. These opportunity losses can easily be determined and are given in Table 4.

Table 4. Opportunity Losses per Pallet per Quarter

Critical Age k    Growth                               No Growth
12                $0.425726 - 0.418529 = $0.007197
16                                                     $0.644500 - 0.634673 = $0.009827

We can conclude that the policy with critical age k = 12 is superior to the policy with k = 16 if

P(growth) · 0.007197 < [1 - P(growth)] · 0.009827.

Thus, the policy with k = 12 is preferable if the probability of growth is approximately less than 0.6, while the policy with k = 16 is better for values greater than or equal to 0.6. Based on management's belief that the number of pallets in the system is not growing, the policy of repairing pallets with age less than 12 quarters should be implemented.

The maximum opportunity loss per year for the entire pool of pallets from using the wrong policy when the growth condition exists is $0.007197 · 4 · 150,000 · (1+g)^t ≈ $4,300.00 · (1+g)^t. The maximum opportunity loss per year from using the wrong policy when the no-growth condition exists is $0.009827 · 4 · 150,000 ≈ $5,900.00. In both instances, the opportunity losses are limited, indicating that the financial consequences of making the wrong decision are not overly severe.

Since the actual condition may also fall somewhere between a growth of g = 0% and g = 4%, policies for values of k between 13 and 15 quarters could be investigated. Such refinement, however, does not appear to be warranted in view of the rather limited opportunity losses.

3. Sensitivity Analysis

The sensitivity of the results obtained was investigated with respect to various assumptions and certain pieces of information, but only a few are reported.

Transitional Behavior

The above analysis has been carried out under the steady state assumption. Figure 1 summarizes the transitional behavior of implementing the optimal repair-replacement policy for both the growth and no-growth condition, assuming the system under the existing no-repair policy is in steady state. The transitional behavior is expressed by the expected cost per pallet and quarter. As can be expected from the structure of the problem, the time required to reach steady state is large. The fact that cost is considerably lower than its steady state value during the first two years (no growth) and three years (growth) is an added incentive for implementing the optimal policy.

Figure 1. Sensitivity Analysis: transitional behavior.

Multiple Damage per Quarter

The model developed above assumes that a pallet can only be damaged once per quarter. Under the no-repair policy the events "damage" and "no-damage" represent a binomial trial. If we allow for multiple damages per quarter, the process can be represented by a Poisson distribution. The probability of d = 0 damages per quarter for a pallet of age i can be expressed by (15) and equated to 1 - P_i. Thus

\[ P(d=0 \mid \lambda_i) = \frac{\lambda_i^{0}}{0!}\,e^{-\lambda_i} = e^{-\lambda_i} = 1 - P_i \tag{15} \]

with

\[ \lambda_i = -\ln(1 - P_i). \]

Naturally, the multiple damage possibility is only relevant for pallets whose age is less than k quarters, as a damaged pallet aged k or more quarters will be replaced. Thus, the maximum value λ can take on is

\[ \lambda_{16} = -\ln(1 - 0.121930) = 0.130 \quad \text{with growth} \]

and

\[ \lambda_{12} = -\ln(1 - 0.134123) = 0.144 \quad \text{with no growth.} \]

The probability of two or more damages per quarter is, in both situations, approximately 0.01 and small enough to justify the assumption of at most one damage per quarter.

Number of State Variables

Based on the survey information, the above analysis was carried out for J = 23, which implies that no pallet will be older than 6 years.
Extrapolating the exponential functions (13) and (14) allows us to evaluate the system when J > 23 and to determine the effect of restricting J to 23. The results for a maximum age of 10 years (i.e., 40 quarters) are given in Table 5.

Table 5. Sensitivity Analysis: Number of State Variables

                    Growth                                 No Growth
Number of States    Critical Age k    Savings per          Critical Age k    Savings per
                                      Pallet/Quarter                         Pallet/Quarter
24                  16                $0.117578            12                $0.114987
40 (a)              18                $0.122530            12                $0.115322

(a) It should be noted that no pallet was older than 33 quarters under the no-growth assumption.

As can be observed, the critical age and the expected savings relative to the no-repair policy do not change for the no-growth assumption and vary only marginally for the growth assumption. Restricting the age of a pallet to a maximum of 6 years appears to be justified.

Substitution of More Durable Pallets

An interesting issue to analyze is the possibility of replacing all wooden pallets by pallets of more durable and less breakable material such as plastic. This alternative is advisable if the cost of maintaining a plastic pallet is less than the same cost for a wooden pallet under the existing no-repair policy. From the above analysis we know that the expected annual cost of maintaining a wooden pallet under the no-repair policy is

$0.536107 · 4 = $2.14: growth
$0.749660 · 4 = $3.00: no growth.

In the extreme case a plastic pallet might be undamageable and would thus last forever. Therefore, only the purchasing price is relevant. The alternative of replacing the wooden pallet is advisable if the cost of purchasing a plastic pallet, R, is less than the present value of all future costs of maintaining a wooden pallet under the no-repair policy. Thus

\[ R < \sum_{t=1}^{\infty} \left( \frac{1}{1+p} \right)^{t} \cdot \begin{cases} \$2.14 & \text{growth} \\ \$3.00 & \text{no growth} \end{cases} \]

with p being the cost of capital. Since

\[ \sum_{t=1}^{\infty} \left( \frac{1}{1+p} \right)^{t} = \frac{1}{p}, \]

the cost of purchasing a plastic pallet with p = 0.15 must be less than

$2.14 · (1/0.15) = $14.25: growth

or

$3.00 · (1/0.15) = $19.98: no growth

respectively, to make the use of plastic pallets economically advisable. Since the cost of a plastic pallet is currently around $25.00 and such pallets certainly do not last forever, the alternative of replacing the pool of wooden pallets by plastic pallets is not recommended.

IV. CONCLUSION

The purpose of the paper was to develop an optimal repair-replacement policy for wooden pallets. The problem has been modeled using Markovian analysis with growth for a variety of conditions. The model was applied and, as shown, substantial savings were indicated.

BIBLIOGRAPHY

[1] Derman, C., "On Sequential Decisions and Markov Chains," Management Science, 9, 16-24 (1962).
[2] Drinkwater, R. W., and N. A. J. Hastings, "An Economic Replacement Model," Operational Research Quarterly, 18, 121-138 (1967).
[3] Hillier, F. S., and G. J. Lieberman, Operations Research, 2nd Edition (Holden Day, Inc., San Francisco, 1974).
[4] Klein, M., "Inspection-Maintenance-Replacement Scheduling Under Markovian Deterioration," Management Science, 9, 25-32 (1962).
[5] Kolesar, P., "Minimum Cost Replacement under Markovian Deterioration," Management Science, 12, 694-706 (1966).
[6] McCall, J. J., "Maintenance Policies for Stochastically Failing Equipment: A Survey," Management Science, 11, 493-524 (1965).
[7] Pierskalla, W. P., and J. A. Voelker, "A Survey of Maintenance Models: The Control and Surveillance of Deteriorating Systems," Naval Research Logistics Quarterly, 23, 353-388 (1976).

ON CONVEXITY PROOFS IN LOCATION THEORY

Robert F. Love
University of Wisconsin
Madison, Wisconsin

James G. Morris
Kent State University
Kent, Ohio

ABSTRACT

It is often assumed in the facility location literature that functions of the type \( \phi_j(x, y) = \beta_j \left[ (x_j - x)^2 + (y_j - y)^2 \right]^{K/2} \) are twice differentiable. Here we point out that this is true only for certain values of K. Convexity proofs that are independent of the value of K are given.
INTRODUCTION

Reference [1] provides a proof of the following theorem:

THEOREM 1: The function

\[ \phi(x, y) = \sum_{j=1}^{n} \beta_j \left[ (x_j - x)^2 + (y_j - y)^2 \right]^{K/2}, \quad K \ge 1, \]

is convex.

The proof is based on the assumption that \( \phi(x, y) \) is a twice differentiable function for all values of x, y in an open convex set [4]. Reference [3] provides a proof that the necessary conditions (partial derivatives equal zero) are also sufficient to ensure a minimum of the function

\[ z(x_1, \ldots, x_n, y_1, \ldots, y_n) = \sum_{i=1}^{n} \sum_{j=1}^{n} \beta_{ij} \left[ (x_j - x_i)^2 + (y_j - y_i)^2 \right]^{1/2} + C. \]

The proof is based on the unstated assumption that the second partial derivatives of z exist for all values of \( (x_1, \ldots, x_n, y_1, \ldots, y_n) \). This assumption is also made in [2]. It is the purpose of this note to show that the assumptions that \( \phi \) and z are twice differentiable are not valid everywhere and to present alternative proofs which do not depend on the existence of derivatives.

DIFFERENTIABILITY

Consider one component of \( \phi(x, y) \) denoted by \( \phi_j(x, y) \), given as \( \phi_j(x, y) = \left[ (x_j - x)^2 + (y_j - y)^2 \right]^{K/2} \). (We let \( \beta_j = 1 \) with no loss of generality for the purpose at hand.) A direction in \( E^2 \) is given by \( \mu = (\mu_1, \mu_2) \), where \( |\mu| = 1 \). Let the first directional derivative of \( \phi_j \) at any point \( a \in E^2 \) be given by \( D_\mu \phi_j(a) \). Then for \( a_j = (x_j, y_j) \), if \( \lambda \) is some real number,

\[ D_\mu \phi_j(a_j) = \lim_{\lambda \to 0} \frac{\phi_j(a_j + \lambda\mu) - \phi_j(a_j)}{\lambda} = \lim_{\lambda \to 0} \frac{\left[ (\lambda\mu_1)^2 + (\lambda\mu_2)^2 \right]^{K/2}}{\lambda} = \lim_{\lambda \to 0} \frac{|\lambda|^{K} (\mu_1^2 + \mu_2^2)^{K/2}}{\lambda}. \]

It is useful to consider three cases when taking the limit: 0 < K < 1, K = 1, and K > 1.

CASE 1, 0 < K < 1:

\[ D_\mu \phi_j(a_j) = \lim_{\lambda \to 0} (\operatorname{sign} \lambda)\, |\lambda|^{K-1} (\mu_1^2 + \mu_2^2)^{K/2} = \begin{cases} +\infty, & \lambda > 0 \\ -\infty, & \lambda < 0. \end{cases} \]

CASE 2, K = 1: (This case has been reported in [5].)

\[ D_\mu \phi_j(a_j) = \begin{cases} (\mu_1^2 + \mu_2^2)^{1/2}, & \lambda > 0 \\ -(\mu_1^2 + \mu_2^2)^{1/2}, & \lambda < 0. \end{cases} \]

CASE 3, K > 1:

\[ D_\mu \phi_j(a_j) = \lim_{\lambda \to 0} \frac{|\lambda|^{K} (\mu_1^2 + \mu_2^2)^{K/2}}{\lambda} = \lim_{\lambda \to 0} (\operatorname{sign} \lambda)\, |\lambda|^{K-1} (\mu_1^2 + \mu_2^2)^{K/2} = 0. \]

In Cases 1 and 2 the first directional derivatives of \( \phi \) do not exist at any fixed point \( a_j \); only in Case 3 do the first derivatives of \( \phi \) exist everywhere on \( E^2 \). Even in this case, however, the second derivatives curtail the usefulness of convexity proofs based on the derivatives.
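The three cases can be illustrated numerically by evaluating the difference quotient at the fixed point a_j for a small step on either side of zero. The sketch below is only an illustration: the direction (0.6, 0.8), the step sizes, and the function name `difference_quotient` are choices made here, not part of the original note.

```python
import math


def difference_quotient(K, lam, mu=(0.6, 0.8)):
    """[phi_j(a_j + lam*mu) - phi_j(a_j)] / lam evaluated at the fixed point a_j.

    With beta_j = 1, phi_j(a_j) = 0 and phi_j(a_j + lam*mu) is the K-th power of
    the Euclidean length of lam*mu, so the fixed point itself drops out.
    """
    length = math.hypot(lam * mu[0], lam * mu[1])
    return length ** K / lam


for K in (0.5, 1.0, 1.5):
    right = difference_quotient(K, 1e-8)    # step from the positive side
    left = difference_quotient(K, -1e-8)    # step from the negative side
    print(f"K = {K}: from the right {right:.4g}, from the left {left:.4g}")
```

For K = 0.5 the quotient grows without bound with opposite signs on the two sides, for K = 1 it stays near +1 and -1, and for K = 1.5 both one-sided values shrink toward zero, in line with Cases 1, 2, and 3.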
Let \( \partial \phi_j / \partial x \) be given by g(x, y), where

\[ g(x, y) = -K (x_j - x) \left[ (x_j - x)^2 + (y_j - y)^2 \right]^{K/2 - 1}. \]

Then

\[ D_\mu g(a_j) = \lim_{\lambda \to 0} \frac{g(a_j + \lambda\mu) - g(a_j)}{\lambda} = \lim_{\lambda \to 0} \frac{K \lambda \mu_1 \left[ (\lambda\mu_1)^2 + (\lambda\mu_2)^2 \right]^{K/2 - 1}}{\lambda} = \lim_{\lambda \to 0} K \mu_1 |\lambda|^{K-2} (\mu_1^2 + \mu_2^2)^{K/2 - 1} = \begin{cases} +\infty, & 0 < K < 2 \\ 0, & K > 2. \end{cases} \]

Thus \( \phi(x, y) \) is not twice differentiable over all of \( E^2 \) for 0 < K < 2. Convexity proofs which do not require differentiability are required for K < 2. In a similar way it can be shown that the derivatives of z are not everywhere continuous [6].

ALTERNATIVE PROOFS OF CONVEXITY

Proof of Theorem 1: Let

\[ \phi(x, y) = \sum_{j=1}^{n} \beta_j f_j(x, y). \]

Since a nonnegative linear combination of convex functions is convex, \( \phi \) is convex if each \( f_j \) is convex. From Love [5] we have that \( h_j(x, y) = \left[ (x_j - x)^2 + (y_j - y)^2 \right]^{1/2} \) is convex. Also, \( h_j(x, y) \) is nonnegative. It follows that \( f_j(x, y) = [h_j(x, y)]^K \), K ≥ 1, is convex since an increasing convex function of a convex function is convex.

THEOREM 2: The function

\[ z(x_1, \ldots, x_n, y_1, \ldots, y_n) = \sum_{i=1}^{n} \sum_{j=1}^{n} \beta_{ij} \left[ (x_j - x_i)^2 + (y_j - y_i)^2 \right]^{1/2} + C \]

is convex.

PROOF: It is sufficient to show that \( f_{ij}(x_i, x_j, y_i, y_j) = \left[ (x_j - x_i)^2 + (y_j - y_i)^2 \right]^{1/2} \) is convex. In Love and Morris [6] it is shown that

\[ \ell_p(q, r) = \left[ \sum_{i=1}^{2} |q_i - r_i|^{p} \right]^{1/p}, \quad p \ge 1, \]

is convex. But \( f_{ij}(x_i, x_j, y_i, y_j) \) is \( \ell_2(q, r) \) if q and r are defined as \( (x_j, y_j) \) and \( (x_i, y_i) \), respectively.

REFERENCES

[1] Cooper, L., "An Extension of the Generalized Weber Problem," Journal of Regional Science, 8, 181-198 (1968).
[2] Cooper, L., "Heuristic Methods for Location-Allocation Problems," SIAM Review, 6, 37-53 (1964).
[3] Dokmeci, F. V., "An Optimization Model For a Hierarchical Spatial System," Journal of Regional Science, 13, 439-451 (1973).
[4] Eggleston, H. G., Convexity (Cambridge University Press, Cambridge, England, 1963).
[5] Love, R. F., "A Note on the Convexity of the Problem of Siting Depots," The International Journal of Production Research, 6, 153-154 (1967).
[6] Love, R. F., and J. G. Morris, "Modelling Inter-City Road Distances by Mathematical Functions," Operational Research Quarterly, 23, 61-71 (1972).
AN EXAMINATION OF THE EFFECTS OF THE CRITERION FUNCTIONAL ON OPTIMAL FIRE-SUPPORT POLICIES*

James G. Taylor and Gerald G. Brown
Naval Postgraduate School
Monterey, California

ABSTRACT

This paper examines the dependence of the structure of optimal time-sequential fire-support policies on the quantification of military objectives by considering four specific problems, each corresponding to a different quantification of objectives (i.e. criterion functional). We consider the optimal time-sequential allocation of supporting fires during the "approach to contact" of friendly infantry against enemy defensive positions. The combat dynamics are modelled by deterministic Lanchester-type equations of warfare, and the optimal fire-support policy for each one-sided combat optimization problem is developed via optimal control theory. The problems are all nonconvex, and local optima are a particular difficulty in one of them. For the same combat dynamics, the splitting of supporting fires between two enemy forces in any optimal policy (i.e. the optimality of singular subarcs) is shown to depend only on whether the terminal payoff reflects the objective of attaining an "overall" military advantage or a "local" one. Additionally, switching times for changes in the ranking of target priorities are shown to be different (sometimes significantly) when the decision criterion is the difference and the ratio of the military worths (computed according to linear utilities) of total infantry survivors and also the difference and the ratio of the military worths of the combatants' total infantry losses. Thus, the optimal fire-support policy for this attack scenario is shown to be significantly influenced by the quantification of military objectives.

*This research was partially supported by the Office of Naval Research. The authors wish to thank the referee for his helpful suggestions. A slightly expanded version of this paper (with numerous annotations elaborating upon various points) has appeared in report form as [45].

1. INTRODUCTION

As one of the authors has pointed out in [40], for the purposes of military operations research it is convenient to consider that there are three essential parts of any time-sequential combat optimization problem:
(a) the decision criteria (for both combatants),
(b) the model of conflict termination (and/or unit breakpoints),
(c) the model of combat dynamics.

An important problem of military operations research is the determination of the relationship between the nature of system objectives and the structure of optimal combat strategies. Of particular importance is the sensitivity of the structure of optimal combat strategies to the nature of military objectives (see [29] for a discussion of the influences of political objectives on military objectives for the evaluation of (time-sequential) combat strategies). In a time-sequential combat optimization problem the combatant objectives are quantified through the criterion functional [8]. If the optimal combat strategy and associated payoff are quite sensitive to the functional form of the criterion functional, then care must be exercised in the selection of the functional form.

An important constituent part of fire support is the target allocation function, which matches a specific weapon type with an acquired target within the target's environment [25]. It is not surprising then that the determination of optimal target allocation strategies for supporting weapon systems [48] is (in one form or another) one of the most extensively studied problems in both the open literature [42, 43] and also classified sources.
During World War II the problem of the appropriate mixture of tactical and strategic air forces (another aspect of the optimal fire-support strategy problem) was extensively debated by experts. Some analysis details are to be found in the classic book by Morse and Kimball (see pp. 73-77 of [27]). This problem was further studied at RAND in the late 1940's and early 1950's [13] and elsewhere [3]. It would probably not be too far-fetched to say that this problem stimulated early research on both dynamic programming [4] and also differential games [13, 18]. Today the problem of the determination of optimal air-war strategies (another aspect of the fire-support problem) is being rather extensively studied by a number of organizations [1, 2, 5, 6, 10, 12, 14, 22, 46].

Thus, the objective of this investigation is to determine the sensitivity of the optimal time-sequential fire-support policy to the functional form of the criterion functional. Our research approach is to combine Lanchester-type models of warfare (see, for example, [34, 38, 40] and references contained therein) with generalized control theory [15, 16] (i.e. optimization theory for dynamic systems). This general research program has been described in more detail elsewhere [40, 41].

It seems appropriate to examine sensitivity of the optimal policy by considering a concrete problem. Consequently, our research approach is to consider several different criterion functionals for the same tactical situation involving the allocation of supporting fires. The tactical situation that we have chosen to examine is the "approach to contact" during an assault on enemy defensive positions by friendly ground forces. We seek to determine the "best" allocation for the supporting fires of the friendly forces.
Weiss [48] has emphasized that a simplified model of a combat situation is particularly valuable when it leads to a clearer understanding of significant relationships which would tend to be obscured in a more complex model. Consequently, we will consider a mathematically tractable version of this problem so that we can make quantitative comparisons among the optimal policies corresponding to the various criterion functionals. Corresponding to each different criterion functional is a different optimization (here, optimal control) problem. Each of these problems has been solved, and the corresponding optimal fire-support policies will be contrasted.

In this paper four different criterion functionals are considered: it is shown that both the difference and the ratio of military worths of friendly and enemy survivors (computed according to linear utilities) and also the ratio of the military worths of friendly and enemy losses as criterion functionals may lead to exactly the same optimal policy. A completely different optimal policy, however, is obtained for the weighted average of force ratios of opposing infantry (at the time that the supporting fires are lifted) as the criterion functional. We have decided that the three former criterion functionals (i.e. the difference and the ratio of the military worths of survivors and the ratio of the military worths of losses) are appropriate for an "attrition" objective,* whereas the weighted average of force ratios is appropriate for a "breakthrough" objective.† (In the latter case, the attacking force tries to overpower the defenders at one place along a front and then pour reinforcements through the break in the defender's defenses in order to "penetrate" behind the enemy lines and, for example, disrupt enemy command, control, and communications.)

The body of this paper is organized in the following fashion.
First, we review previous work on the relationship between the quantification of military objectives and the structure of optimal time-sequential fire-distribution policies in order to place the work at hand in proper perspective. Then we describe the fire-support problem and discuss the four criterion functionals that will be used to determine optimal fire-support policies. Each of these criterion functionals represents a different quantification of military objectives, and all appear to be reasonable criteria. Next, the optimal fire-support policies are described for the four problems, and the structures of the four optimal policies are contrasted. We then justify these optimization results by sketching their development via modern optimal control theory. This development is given for each of the four problems. Finally, we discuss what we have learned from our investigation of the dependence of the structure of optimal fire-support policies on the quantification of military objectives.

2. PREVIOUS WORK ON THE STRUCTURE OF OPTIMAL FIRE-DISTRIBUTION POLICIES

The only systematic examinations of the influences of the nature of the criterion functional on the structure of optimal time-sequential fire-distribution strategies known to the authors are those of Taylor [30-33, 35, 37, 40, 43]. In [31] and [40], however, the influences of the nature of the target-type attrition process on the structure of optimal fire-distribution policies were examined. In [30-33] and [40] a linear utility‡ was assumed for the military worth of the number of each surviving weapon system type, and the criterion functional (payoff) was taken to be the net military worth of survivors (i.e. the difference between the military worths of friendly and enemy forces). Taylor [30-33, 40] has studied how the optimal fire-distribution policy depends on the assignment of these linear utilities.
In other words, he examined the sensitivity of the optimal combat policy to parametric variations in the assigned linear utilities for survivors. It has been shown that the n-versus-one fire-distribution problems studied in [30-33] all have quite simple solutions when enemy survivors are valued in direct proportion to their kill capabilities (as measured by their Lanchester attrition-rate coefficients [34, 38] against the (homogeneous) friendly forces).

*In other words, the friendly forces seek an "overall" military advantage.
†In other words, the friendly forces seek a "local" military advantage.
‡See [17] for the methodology for the development of these linear utilities. For optimal control/differential game combat optimization problems, the assumption of linear utilities yields that the boundary conditions for the adjoint variables (at least when no terminal state constraint is active) are independent of the values of the state variables. Serious computational difficulties may arise when nonlinear utilities are assumed. The effects of assuming nonlinear utilities for military resources upon the evaluation of time-sequential combat strategies have apparently never been studied.

Pugh and Mayberry [29] have suggested that an appropriate payoff, or objective function (in our terminology, criterion functional), for the quantitative evaluation of combat strategies is the loss ratio (calculated possibly using weighting factors for heterogeneous forces). They have stated [29] that an "almost equivalent" criterion is the loss difference. However, Pugh and Mayberry [29] do not explore the consequences of various functional forms for the criterion functional. In this paper we will examine to what extent these criteria are, in fact, equivalent. In combat problems with either no replacements or a fixed-length planning horizon, it is readily seen that minimizing the loss difference is the same as maximizing the difference in survivors.
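The equivalence just stated can be written out in one line. As a sketch, let $x^0$, $y^0$ denote the initial (worth-weighted) friendly and enemy strengths and $x(T)$, $y(T)$ the corresponding survivors; these aggregate symbols are shorthand assumed here for illustration only. With no replacements, each side's losses are its initial strength minus its survivors, so

```latex
\underbrace{\bigl[x^0 - x(T)\bigr] - \bigl[y^0 - y(T)\bigr]}_{\text{loss difference}}
  \;=\; \bigl(x^0 - y^0\bigr) \;-\; \underbrace{\bigl[x(T) - y(T)\bigr]}_{\text{difference in survivors}} .
```

Since $x^0 - y^0$ is fixed by the initial conditions, minimizing the left-hand side over the admissible policies is the same as maximizing $x(T) - y(T)$.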
It is such a case of no replacements that we will examine here. It remains to determine the "equivalence" of minimizing the loss ratio to maximizing the ratio of survivors and to relate these results to those for maximizing the difference in survivors.

Furthermore, for the evaluation of combat strategies it is of interest to consider the military worth (i.e. utility of military resources) of survivors. In almost all* the work that has appeared in the open literature [41], a linear utility has been assumed for valuation of survivors, and some form of net military worth (i.e. the difference between the military worths of friendly and enemy survivors) has been taken as the payoff (i.e. criterion functional) [26, 30-33, 35, 40, 41]. One reason for assuming such linear utilities is that of mathematical tractability: the boundary conditions for the dual variables do not depend on the state variable values (at least when no terminal constraint involving the state variables is active). The only study known to the authors of the consequences of nonlinear utilities for survivors is contained in [37], where Kawara's supporting weapon system game [20] is examined. Taylor [37] has determined (at least for the case in which the supporting weapon system of the appropriate side (in Kawara's case, the defender) is not annihilated) the most general form of the criterion functional which leads to optimal fire-support strategies being independent of force levels, and he has shown that the criterion functional chosen by Kawara [20] is a special case of this form. In other words, Taylor has shown that Kawara's conclusion [20] that optimal fire-support strategies do not depend on force levels applies only to problems with the special type of criterion functional used by Kawara and is not true in general. No other examination of the dependence of optimal combat strategies on combatant objectives is known to the authors.

*The only exceptions known to the authors are the papers by Chattopadhyay [9] and Kawara [20]. For example, in Kawara's paper [20] the payoff is the ratio of opposing infantry strengths (measured in terms of total numbers) at the "end of battle" (see also the differential game studied in Appendix D of [43]).

3. COMPARISON OF OPTIMAL FIRE-SUPPORT POLICIES

In this section we give the fire-support allocation problem for which the optimal policy is developed according to four different criterion functionals. These fire-support policies are then compared.

3.1. The Fire-Support Problem

Let us consider the attack of heterogeneous X forces against the static defense of heterogeneous Y forces along a "front." Each side is composed of primary units (or infantry) and fire-support units (or artillery). The X infantry (denoted as $X_1$ and $X_2$) launches an attack against the positions held by the Y infantry (denoted as $Y_1$ and $Y_2$). We may consider $X_1$ and $X_2$ to be infantry units operating on spatially separated pieces of terrain. We assume that the $X_1$ infantry unit attacks the $Y_1$ infantry unit, and similarly for $X_2$ and $Y_2$, with no "crossfire" (e.g. the $X_1$ infantry is not attrited by the $Y_2$ infantry).

We will consider only the "approach to contact" phase of the battle. This is the time from the initiation of the advance of the $X_1$ and $X_2$ forces towards the $Y_1$ and $Y_2$ defensive positions until the $X_1$ and $X_2$ forces actually make contact with the enemy infantry in "hand-to-hand" combat. It is assumed that this time is fixed and known to X. The $X_i$ forces begin their advance against the $Y_i$ forces from a distance and move towards the $Y_i$ position. The objective of the $X_i$ forces during the "approach to contact" is to close with the enemy position as rapidly as possible. Accordingly, small arms fire by the $X_i$ forces is held at a minimum or firing is done "on the move" to facilitate rapid movement.
It is not unreasonable, therefore, to assume that the effectiveness of the fire of the $X_i$ force "on the move" is negligible against $Y_i$. It may be shown that such an approximation is necessary for reasons of mathematical tractability in the fire-support optimal control problem to be subsequently given. See the Appendix for further details.

We assume, moreover, that the defensive $Y_i$ fire (for $i = 1, 2$) causes attrition to the advancing $X_i$ forces in their "field of fire" at a rate proportional to only the number of $Y_i$ firers. Let $a_i$ denote the constant of proportionality. It is convenient to refer to the attrition of a target type as being a "square-law" process when the casualty rate is proportional to the number of enemy firers only and as being a "linear-law" process when it is proportional to the product of the number of enemy firers and remaining targets [31-33]. Brackney [7] has hypothesized that a "square-law" attrition process occurs when the time to acquire targets is negligible in comparison with the time to destroy them. He has pointed out that such a situation is to be expected to occur when one force assaults another. Additionally, we assume that either the Y forces have no fire-support units or their fire support is "organic" to the Y units (i.e. fire-support units are integrated with $Y_i$ and only those with $Y_i$ support $Y_i$).

During the "approach to contact" the X fire-support units (denoted as W) deliver "area fire" against the $Y_i$ forces. In other words, we assume that X's fire-support units fire into the (constant) area containing the enemy's infantry without feedback as to the destructiveness of this fire. Let $\phi_i$ denote the fraction of the W fire-support units which fire at $Y_i$. (We then have that $\phi_1 + \phi_2 = 1$ and $\phi_i \ge 0$ for $i = 1, 2$.) Then for constant $\phi_i$ there are a constant number of fire-support units firing at $Y_i$, since we assume that the W fire-support units are not in the combat zone and do not suffer attrition.
In this case, the $Y_i$ attrition rate is proportional to the $Y_i$ force level [19, 47]. Let $c_i$ denote the corresponding constant of proportionality. This combat situation is shown diagrammatically in Figure 1.

Figure 1. Diagram of fire-support problem considered for examination of effect of criterion functional on optimal fire-support policy.

It is the objective of the X forces to utilize their fire-support units (denoted as W) over time in such a manner as to achieve the "most favorable" situation at the end of the "approach to contact", at which time the force separations between opposing infantries are zero and artillery fires must be lifted from the enemy's positions in order not to also kill friendly forces. The "outcome" of this phase of battle may be measured in several different ways and is quantitatively expressed through the criterion functional (denoted as J). Thus, we have the following optimal control problem for the determination of the optimal fire-support allocation policy (denoted as
$\phi^*(t)$ for $0 \le t \le T$, where $T$ denotes the time of the end of the "approach to contact") for the W fire-support units:

(1) $$\underset{\phi_i(t)}{\text{maximize}}\; J \quad \text{with stopping rule: } t_f - T = 0,$$

subject to (battle dynamics):

$$\frac{dx_i}{dt} = -a_i y_i, \qquad \frac{dy_i}{dt} = -\phi_i c_i y_i \qquad \text{for } i = 1, 2,$$

with initial conditions $x_i(0) = x_i^0$ and $y_i(0) = y_i^0$ for $i = 1, 2$, and

$x_1, x_2, y_1, y_2 \ge 0$ (State Variable Inequality Constraints),
$\phi_1 + \phi_2 = 1$ and $\phi_i \ge 0$ for $i = 1, 2$ (Control Variable Inequality Constraints),

where $J$ denotes the criterion functional, $x_i(t)$ denotes the number of $X_i$ infantry at time $t$ (similarly for $y_i(t)$), $a_i$ is a constant (Lanchester) attrition-rate coefficient (reflecting the effectiveness of $Y_i$ fire against $X_i$), $c_i$ is a constant (Lanchester) attrition-rate coefficient (reflecting the effectiveness of W supporting fires against $Y_i$), $t_f$ (with numerical value $T$) denotes the end of the optimal control problem, and $\phi_i$ denotes the fraction of W fire support directed at $Y_i$.

It will be convenient to consider the single control variable $\phi$ defined by

(2) $$\phi = \phi_1,$$

so that $\phi_2 = (1 - \phi)$ and $0 \le \phi \le 1$. For $T < +\infty$ it follows that $y_i(t) > 0$ for $0 \le t \le T$. Thus, the only state variable inequality constraints (SVIC's) that must be considered are $x_i \ge 0$. However, let us further assume that the attacker's infantry force levels are never reduced to zero. This assumption applies to all feasible solutions and may be militarily justified on the grounds that X would not attack the $Y_i$ positions if his attacking $X_i$ forces could not survive the "approach to contact."

3.2. The Criterion Functionals Considered

The four criterion functionals for which the optimal fire-support allocation policies will be compared are given in Table 1. All are functions only of the various numbers of combatants at the end of the planning horizon (i.e. at the end of the "approach to contact", at which time the supporting fires must be lifted for safety reasons).

Table 1. Summary of problems considered to study effect of criterion functional on optimal fire-support policy.

Problem    Criterion Functional, J
1          $J_1 = \sum_{k=1}^{2} \alpha_k x_k(T)/y_k(T)$
2          $J_2 = \sum_{k=1}^{2} v_k x_k(T) - \sum_{k=1}^{2} w_k y_k(T)$
3          $J_3 = \{\sum_{k=1}^{2} v_k x_k(T)\} / \{\sum_{k=1}^{2} w_k y_k(T)\}$
4          $J_4 = -\{\sum_{k=1}^{2} v_k (x_k^0 - x_k(T))\} / \{\sum_{k=1}^{2} w_k (y_k^0 - y_k(T))\}$

The criterion functional for Problem 1 (i.e. $J_1 = \sum_{k=1}^{2} \alpha_k x_k(T)/y_k(T)$) represents a weighted average of the force ratios of opposing numbers of infantry in the two infantry combat zones. The rationale behind this choice is that, in each combat area (i.e. the area of combat between $X_i$ and $Y_i$), combat (possibly hand-to-hand) between the $X_i$ and $Y_i$ forces will follow the "approach to contact" and the (initial) force ratio will be related to the outcome of this subsequent combat action. The weighting factors (i.e. $\alpha_k$ for $k = 1, 2$) allow one to assign relative weights to this subsequent combat between $X_i$ and $Y_i$ in the two combat areas.

The criterion functional for Problem 2 (i.e. $J_2 = \sum_{k=1}^{2} v_k x_k(T) - \sum_{k=1}^{2} w_k y_k(T)$) represents the difference between the military worths (computed using linear utilities) of the surviving X and Y forces at the end of the "approach to contact." As noted above in Section 2, maximizing the difference in worth of survivors is the same as minimizing the loss difference in combat problems (such as the one at hand) with no replacements.

The criterion functional for Problem 3 (i.e. $J_3 = \{\sum_{k=1}^{2} v_k x_k(T)\}/\{\sum_{k=1}^{2} w_k y_k(T)\}$) represents the ratio of total military worths of the surviving X and Y forces, whereas the one for Problem 4 (i.e. $J_4 = -\{\sum_{k=1}^{2} v_k (x_k^0 - x_k(T))\}/\{\sum_{k=1}^{2} w_k (y_k^0 - y_k(T))\}$) represents the ratio of military worths of losses. Both the loss ratio and the loss difference have been proposed by Pugh and Mayberry [29] as appropriate payoffs for the evaluation of combat strategies. They state that (see p.
869 of [29]) "when the most straightforward estimate of a weighting factor for the loss difference is used, the two criteria are almost equivalent." From the study at hand, we will see that a similar statement is true: the two criteria are equivalent for a certain "natural" valuation of forces (see next section), but otherwise they may yield slightly different optimal fire-support policies.

3.3. Optimal Fire-Support Policies

In this section we give the optimal time-sequential fire-support policies for the four problems presented in the previous section. As discussed above, each of these problems corresponds to a different decision criterion for the attackers, with all other aspects of the problem (i.e. combat dynamics and length of the planning horizon) being the same in all problems. In all cases we assume that neither of the attacking infantry forces can be reduced to a zero force level during the approach to contact, i.e. problem parameters and initial force levels are such that $x_i(T) > 0$ for $i = 1, 2$.

For Problem 1 with $J_1 = \sum_{k=1}^{2} \alpha_k x_k(T)/y_k(T)$, the optimal (open-loop) fire-support policy is

(3) $$\phi^*(t; r_1^0, r_2^0, T) = \begin{cases} 1 \text{ for } 0 \le t \le T & \text{when } F_1(r_1^0; T) > F_2(r_2^0; T), \\ 0 \text{ for } 0 \le t \le T & \text{when } F_1(r_1^0; T) < F_2(r_2^0; T), \end{cases}$$

where $r_i(t) = r_i = x_i/y_i$, $r_i^0 = r_i(0)$, and

(4) $$F_i(r_i^0; T) = \alpha_i a_i c_i \left\{ \left(\frac{r_i^0}{a_i}\right)\left(\frac{e^{c_i T} - 1}{c_i}\right) - \frac{1}{c_i^2}\left(e^{c_i T} - 1 - c_i T\right) \right\}.$$

For Problem 2 with $J_2 = \sum_{k=1}^{2} v_k x_k(T) - \sum_{k=1}^{2} w_k y_k(T)$ and Problem 3 with $J_3 = \{\sum_{k=1}^{2} v_k x_k(T)\}/\{\sum_{k=1}^{2} w_k y_k(T)\}$, we make the nonrestrictive assumption that

(5) $$\frac{w_1}{a_1 v_1} \ge \frac{w_2}{a_2 v_2}.$$

Then the optimal (closed-loop) fire-support policy may be expressed in the same form for Problems 2 and 3. It is best explained by considering that the battle is divided into two phases, denoted as Phase I and Phase II.
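During Phase I for $0 \le t \le t_1 = T - \tau_1(y_1^f/y_2^f)$, the optimal policy is

(6) $$\phi^*(t, x, y) = \begin{cases} 1 & \text{for } \rho > a_2 c_2 v_2/(a_1 c_1 v_1), \\ c_2/(c_1 + c_2) & \text{for } \rho = a_2 c_2 v_2/(a_1 c_1 v_1), \\ 0 & \text{for } \rho < a_2 c_2 v_2/(a_1 c_1 v_1), \end{cases}$$

where

(7) $$\rho = y_1/y_2;$$

while during Phase II for $T - \tau_1(y_1^f/y_2^f) \le t \le T$, the optimal policy is

(8) $$\phi^*(t, x, y) = 1,$$

where

(9) $$\tau_1 = \tau_1(y_1^f/y_2^f) = \begin{cases} \tau_S & \text{for } \rho^f \ge \rho_S^f, \\ \tau_\phi & \text{for } \rho_L^f < \rho^f < \rho_S^f, \\ 0 & \text{for } \rho^f \le \rho_L^f, \end{cases}$$

(10) $$\rho_L^f = \left(\frac{a_2 c_2 v_2}{a_1 c_1 v_1}\right)\left(\frac{w_2}{a_2 v_2}\right) \Big/ \left(\frac{w_1}{a_1 v_1}\right),$$

and $\rho_S^f$ denotes the final ratio $(y_1/y_2)^f$ such that as one works backwards from this end point, the optimal path leads "directly to" the "singular surface" after requiring use of the policy (8) for a finite interval of time. We will examine below how $\tau_1$ depends on whether or not inequality holds in (5). Furthermore, $\tau_S$ is the unique non-negative root of $F(\tau_S) = 0$, where the function $F(\tau)$ is different for Problems 2 and 3 and is given below. For $\rho_L^f < \rho^f < \rho_S^f$, $\tau_\phi$ is the smaller of the two positive roots of $G(\tau_\phi; \rho^f) = 0$, where the function $G(\tau; \rho^f)$ is also different for Problems 2 and 3 and is given below. $\tau_S$ and $\tau_\phi$ may be called switching times. It has been shown that (a) bounds on $\tau_\phi$ are given by $0 < \tau_\phi < \tau_S$, (b) $\tau_\phi$ is a strictly increasing function of $\rho^f$ for $\rho_L^f < \rho^f < \rho_S^f$, and (c) there is no root to $G(\tau_\phi; \rho^f) = 0$ for $\rho^f > \rho_S^f$.

For Problem 2, we have

(11) $$F(\tau) = \tau + \left(\frac{1}{c_1} - \frac{w_1}{a_1 v_1}\right) e^{-c_1 \tau} - \left(\frac{1}{c_1} - \frac{w_2}{a_2 v_2}\right),$$

and

(12) $$G(\tau; \rho^f) = \frac{1}{c_1}\left(e^{c_1 \tau} - 1\right)\left(\frac{a_1 c_1 v_1}{a_2 c_2 v_2}\right)\rho^f - \tau + \left(\frac{a_1 c_1 v_1}{a_2 c_2 v_2}\right)\left(\frac{w_1}{a_1 v_1}\right)\rho^f - \left(\frac{w_2}{a_2 v_2}\right).$$

Bounds on $\tau_S$ in Problem 2 are given by

(a) for $w_1/(a_1 v_1) \le 1/c_1$,

(13) $$\frac{w_1}{a_1 v_1} - \frac{w_2}{a_2 v_2} \le \tau_S \le \frac{1}{c_1}\left\{1 - \left(\frac{w_2}{a_2 v_2}\right) \Big/ \left(\frac{w_1}{a_1 v_1}\right)\right\}, \text{ and}$$

(b) for $1/c_1 < w_1/(a_1 v_1)$,

(14) $$\frac{1}{c_1}\left\{1 - \left(\frac{w_2}{a_2 v_2}\right) \Big/ \left(\frac{w_1}{a_1 v_1}\right)\right\} \le \tau_S \le \frac{w_1}{a_1 v_1} - \frac{w_2}{a_2 v_2}.$$

For Problem 3, we have

(15) $$F(\tau) = \tau + \left(\frac{1}{c_1} - \frac{J_3 w_1}{a_1 v_1}\right) e^{-c_1 \tau} - \left(\frac{1}{c_1} - \frac{J_3 w_2}{a_2 v_2}\right),$$

and

(16) $$G(\tau; \rho^f) = \frac{1}{c_1}\left(e^{c_1 \tau} - 1\right)\left(\frac{a_1 c_1 v_1}{a_2 c_2 v_2}\right)\rho^f - \tau + J_3\left\{\left(\frac{a_1 c_1 v_1}{a_2 c_2 v_2}\right)\left(\frac{w_1}{a_1 v_1}\right)\rho^f - \left(\frac{w_2}{a_2 v_2}\right)\right\}.$$

Bounds on $\tau_S$ in Problem 3 are given by

(a) for $J_3 w_1/(a_1 v_1) \le 1/c_1$,

(17) $$J_3\left(\frac{w_1}{a_1 v_1} - \frac{w_2}{a_2 v_2}\right) \le \tau_S \le \frac{1}{c_1}\left\{1 - \left(\frac{w_2}{a_2 v_2}\right) \Big/ \left(\frac{w_1}{a_1 v_1}\right)\right\}, \text{ and}$$

(b) for $1/c_1 < J_3 w_1/(a_1 v_1)$,

(18) $$\frac{1}{c_1}\left\{1 - \left(\frac{w_2}{a_2 v_2}\right) \Big/ \left(\frac{w_1}{a_1 v_1}\right)\right\} \le \tau_S \le J_3\left(\frac{w_1}{a_1 v_1} - \frac{w_2}{a_2 v_2}\right).$$

Also for Problem 3, we have

(19) $$\partial \tau_S/\partial J_3 > 0.$$

The solution to Problem 4 is exactly like that to Problem 3 except that $J_3$ in Problem 3 is replaced by $(-J_4)$.

Let us now sketch the proofs of a few statements just made above about the switching times $\tau_S$ and $\tau_\phi$. The existence of a unique nonnegative root to $F(\tau_S) = 0$ for $w_1/(a_1 v_1) > w_2/(a_2 v_2)$ follows from $F(0) < 0$ and $F'(\tau) \ge 0$, $\forall\, \tau \ge 0$. The existence of two positive roots to $G(\tau_\phi; \rho^f) = 0$ (here the second argument, $\rho^f$, is a (fixed) parameter) for $w_1/(a_1 v_1) > w_2/(a_2 v_2)$ and $\rho_L^f < \rho^f < \rho_S^f$ follows from $G(0) > 0$ for $\rho^f > \rho_L^f$ and the fact that (letting $\hat{\tau}$ denote the unique value of $\tau$ at which the global minimum of the strictly convex function $G(\tau)$ occurs) $G(\hat{\tau}; \rho^f) = -F(\hat{\tau}) < 0$ for $\rho^f < \rho_S^f$. The latter is a consequence of $\partial G/\partial \rho^f > 0$ and $G(\tau_S; \rho_S^f) = -F(\tau_S) = 0$. It should be noted that the fact that $G'(\hat{\tau}; \rho^f) = 0$ allows the parameter $\rho^f$ to be eliminated from $G(\hat{\tau}; \rho^f)$. It also follows that there is no solution (i.e. value of $\tau_\phi$) to $G(\tau_\phi; \rho^f) = 0$ for $\rho^f > \rho_S^f$. The proof that $d\tau_S/dJ_3 = -(\partial F/\partial J_3)/(\partial F/\partial \tau_S) > 0$ follows from $\partial F/\partial \tau_S > 0$ and $\partial F/\partial J_3 < 0$ (the latter holding since $\{\exp(-c_1 \tau) - 1 + c_1 \tau\} > 0$).

We will now illustrate the structure of the optimal fire-support policies for the first three problems by considering some numerical examples. The basic parameter set used in the numerical computations is shown in Table 2. Numerical results have not been obtained for Problem 4 when $w_1/(a_1 v_1) \ne w_2/(a_2 v_2)$ because of the difficulty in solving the associated two-point boundary-value problem. The structure of the optimal policy, however, is similar to that for Problems 2 and 3, although switching times are, in general, very difficult to determine.

Table 2. Basic parameter set for numerical examples.

i     a_i*     c_i†
1     0.020    0.06
2     0.015    0.05
T = 30 minutes

*$a_i$ has units of [$X_i$ casualties/{(minute) × (number of $Y_i$)}].
†$c_i$ has units of [$Y_i$ casualties/{(minute) × (number of $Y_i$)}].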
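The switching time $\tau_S$ is defined above only implicitly, as the root of $F(\tau_S) = 0$. The following is a minimal numerical sketch of how such a root might be computed, not the authors' procedure: it uses plain bisection with the Table 2 parameters together with the utility values $v_1 = v_2 = 15.0$, $w_1 = 4.0$, $w_2 = 1.5$ and terminal values $x_1^f = x_2^f = 200.0$, $y_2^f = 50.0$ that the paper quotes for its numerical examples. The function names (`F`, `bisect`, `F_problem3`) are illustrative, and for Problem 3 the code assumes, as the paper does, that $y_1^f$ is eliminated through $(y_1/y_2)^f = [a_2 c_2 v_2/(a_1 c_1 v_1)] \exp(-c_1 \tau_S)$.

```python
import math

# Table 2 parameters (from the text).
a1, a2 = 0.020, 0.015   # effectiveness of Y_i fire against X_i
c1, c2 = 0.06, 0.05     # effectiveness of W supporting fire against Y_i
# Utility values quoted for the paper's numerical examples.
v1 = v2 = 15.0
w1, w2 = 4.0, 1.5


def F(tau, J=1.0):
    """F of (11) when J = 1 (Problem 2), and of (15) when J = J_3 (Problem 3)."""
    return (tau
            + (1.0 / c1 - J * w1 / (a1 * v1)) * math.exp(-c1 * tau)
            - (1.0 / c1 - J * w2 / (a2 * v2)))


def bisect(f, lo, hi, tol=1e-10):
    """Plain bisection; assumes f(lo) < 0 <= f(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def F_problem3(tau, x1f=200.0, x2f=200.0, y2f=50.0):
    """F of (15) with J_3 expressed through tau (y_1^f eliminated as in the text)."""
    rho_sf = (a2 * c2 * v2) / (a1 * c1 * v1) * math.exp(-c1 * tau)
    y1f = rho_sf * y2f
    J3 = (v1 * x1f + v2 * x2f) / (w1 * y1f + w2 * y2f)
    return F(tau, J3)


tau_s_problem2 = bisect(F, 0.0, 30.0)
tau_s_problem3 = bisect(F_problem3, 0.0, 30.0)
# Limit of the Problem 3 switching time as J_3 grows without bound.
tau_s_limit = (1.0 / c1) * math.log((w1 / (a1 * v1)) / (w2 / (a2 * v2)))

print(f"tau_S (Problem 2) = {tau_s_problem2:.2f} min")   # the text reports 7.93
print(f"tau_S (Problem 3) = {tau_s_problem3:.2f} min")   # the text reports 11.37
print(f"limiting tau_S    = {tau_s_limit:.2f} min")      # the text reports 11.55
```

Bisection is used here only because $F(0) < 0$ and $F$ is increasing, so a sign change brackets the root; any standard root finder would serve equally well.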
For Problem 1 it is convenient to introduce the "local" force ratio [36, 39] $r_i = x_i/y_i$, which represents the ratio of the numbers of opposing infantry in each of the two combat areas (see Figure 1). The optimal fire-support policy is most conveniently expressed as an open-loop control in terms of the two initial force ratios, denoted as $r_i^0 = r_i(0)$ for $i = 1, 2$, and the given length of time for the approach to contact, $T$. Let us take $\alpha_1 = \alpha_2 = 1.0$. Then the optimal fire-support policy is graphically depicted in Figure 2. In the initial force-ratio space, the line with equation (20), where $R = \alpha_1 a_1 c_1/(\alpha_2 a_2 c_2)$, is a "dispersal line" [18, 30, or 40] away from which all optimal battle trajectories flow. This is shown in Figure 3. In other words, on this line the same return is obtained from using $\phi^* = 0$ or $\phi^* = 1$ all of the time. This dispersal line is determined by equating the extremal returns as a function of initial conditions for these two policies (see Section 4.3). In constructing Figure 3, we have used facts like the following: when $\phi = 1$ for $0 \le t \le T$ and $r_2^f = 0$, then

(21) $$r_2^0 = a_2 T.$$

Figure 2. Optimal (open-loop) fire-support policy for Problem 1.

Figure 3. Optimal battle trajectories resulting from optimal (open-loop) fire-support policy for Problem 1.

For Problems 2, 3, and 4, the optimal fire-support policy (expressed as a closed-loop control (see [16] or [35])) is most conveniently expressed in terms of $y_1/y_2$ (i.e. the ratio of the numerical strengths of the two defending infantry forces) and $\tau = T - t$ (i.e. the "backwards" time or "time to go" in the approach to contact). When enemy forces are valued in direct proportion to the rate at which they destroy value of the friendly forces, i.e.
(22) $$w_i = k a_i v_i \quad \text{for } i = 1, 2,$$

the optimal fire-support policy takes a particularly simple form (denoted as Policy A):

POLICY A: For $0 \le t \le T$,

(23) $$\phi^*(t, x, y) = \begin{cases} 1 & \text{for } y_1/y_2 > a_2 c_2 v_2/(a_1 c_1 v_1), \\ c_2/(c_1 + c_2) & \text{for } y_1/y_2 = a_2 c_2 v_2/(a_1 c_1 v_1), \\ 0 & \text{for } y_1/y_2 < a_2 c_2 v_2/(a_1 c_1 v_1). \end{cases}$$

This is shown pictorially in Figure 4, in which optimal trajectories are traced backwards in time. It is convenient to note that, for example, when $\phi(\tau)$ is constant on an interval of backwards time beginning at $\tau = 0$, we have $\rho(\tau) = \rho^f \exp\{[\phi c_1 - (1 - \phi) c_2]\tau\}$ on that interval. When (22) holds, $\tau_1 = 0$ (see (9), (11), and (15) above), i.e. the entire approach to contact is "Phase I."

Figure 4. Diagram of optimal (closed-loop) fire-support policy (Policy A) for Problems 2, 3, and 4 when $w_1/(a_1 v_1) = w_2/(a_2 v_2)$. (Horizontal axis: backwards time $\tau$ in minutes, with $\tau = 0$ at $t = T$.)

When enemy forces are not valued in direct proportion to the rate at which they destroy value of the friendly forces (without loss of generality we may assume that $w_1/(a_1 v_1) > w_2/(a_2 v_2)$), the solutions to Problems 2 and 3 are considerably more complex, as shown in Figure 5. For constructing this figure, we have taken $v_1 = v_2 = 15.0$, $w_1 = 4.0$, and $w_2 = 1.5$, with other parameter values the same as shown in Table 2. As we have seen above, the planning horizon may be considered to consist of two phases (denoted as Phase I and as Phase II), during each of which a different fire-support allocation rule is optimal. We denote this overall optimal policy as Policy B (see (6) and (8)). During Phase I, Policy A is optimal; whereas during Phase II, it is optimal to always concentrate all artillery fire on $Y_1$ (which has been valued disproportionately high).

Figure 5. Diagram of optimal (closed-loop) fire-support policy (Policy B) for Problem 2 when $w_1/(a_1 v_1) > w_2/(a_2 v_2)$. (Horizontal axis: backwards time $\tau$ in minutes, with $\tau = 0$ at $t = T$.)
(The structure of the optimal fire-support policy is similar for Problems 3 and 4.)

The absence or presence of Phase II itself in the optimal time-sequential fire-support policy depends on the ratio of enemy infantry strengths $\rho = y_1/y_2$. For Problem 2, the length of Phase II (i.e. $\tau_1$) is independent of the final force levels of the attacking friendly infantry units (i.e. $x_1^f$ and $x_2^f$) and depends only on $\rho^f = y_1^f/y_2^f$ and the combat effectiveness parameters (see equations (1)), whereas for Problem 3 the length of Phase II does depend directly on $x_1^f$ and $x_2^f$ through the criterion functional $J_3 = \{\sum_{k=1}^{2} v_k x_k^f\}/\{\sum_{k=1}^{2} w_k y_k^f\}$. Thus, we see that $\tau_1$ may be quite different for Problems 2 and 3: for example, for the parameter set shown in Table 2 (plus force utility values $v_1 = v_2 = 15.0$, $w_1 = 4.0$, and $w_2 = 1.5$, and terminal values $x_1^f = x_2^f = 200.0$ and $y_2^f = 50.0$), we have $\tau_S$(Problem 2) = 7.93 minutes, while $\tau_S$(Problem 3) = 11.37 minutes. (For computing $\tau_S$(Problem 3) by using $F(\tau)$ given by (15), we have used the fact that

$$\left(\frac{y_1}{y_2}\right)^f = \left(\frac{a_2 c_2 v_2}{a_1 c_1 v_1}\right) \exp(-c_1 \tau_S)$$

to eliminate $y_1^f$ from $J_3$.) Recalling (19) and observing that

(24) $$\lim_{J_3 \to +\infty} \tau_S(\text{Problem 3}) = (1/c_1) \ln\left\{\left[\frac{w_1}{a_1 v_1}\right] \Big/ \left[\frac{w_2}{a_2 v_2}\right]\right\},$$

we see that for this parameter set the largest that $\tau_S$(Problem 3) may be is $\lim_{J_3 \to +\infty} \tau_S(\text{Problem 3}) = 11.55$ minutes. Thus, for this parameter set, $\tau_S$(Problem 2) and $\tau_S$(Problem 3) may differ by at most fifty percent.

3.4. Discussion of Comparison

In this section we will contrast the structure of the optimal fire-support policies for the four problems considered above. Let us recall that in all cases we have assumed that $x_1^f, x_2^f > 0$.

For Problem 1 the optimal fire-support policy is to always concentrate all artillery fire (i.e. supporting fires) on just one of the two opposing enemy infantry units. This policy will maximize the force ratio at the end of the approach to contact in one of the combat areas (i.e. $x_i^f/y_i^f$) and may be considered to be a "breakthrough" tactic.
In other words, one concentrates all fire support on the key enemy unit in order to overwhelm it and effect a penetration.

On the other hand, for Problems 2, 3, and 4 the optimal fire-support policy may involve splitting of fires between the two enemy troop concentrations. This property of the solution has been anticipated in Taylor's earlier work on the optimal control of "linear-law" Lanchester-type attrition processes [31, 32] (see also [43]). We may consider this policy to be an "attrition" tactic which aims to wear down the overall enemy strength. The structures of the optimal policies for Problems 2, 3, and 4 are similar, although the switching times (i.e. $\tau_\phi$ and $\tau_S$) may be appreciably different when enemy forces are not valued in direct proportion to the rate at which they destroy value of the friendly forces. In such a case we may assume without loss of generality that inequality holds in (5), i.e.

(25) $$w_1/(a_1 v_1) > w_2/(a_2 v_2).$$

The functional dependences of these switching times are also different in Problems 2, 3, and 4. For Problem 2 the switching times (i.e. the $\phi$-transition surface) are independent of the force levels of the attacking friendly forces (i.e. $x_1$ and $x_2$), as is the optimal policy itself. For Problem 3 the switching times depend (see (15) and (16) above) on the ratio of military worths of surviving infantry forces (computed using linear utilities), i.e. $J_3 = \{\sum_{k=1}^{2} v_k x_k(T)\}/\{\sum_{k=1}^{2} w_k y_k(T)\}$. It has been shown (see Section 3.3 above) that $\partial \tau_S/\partial J_3 > 0$, so that the larger $J_3$ becomes, the more time is spent concentrating fire on $Y_1$, although there is an upper limit to this time (see (24)). Similar results hold for Problem 4, only with $J_3$ replaced by $(-J_4)$. For comparing the switching times between Problems 3 and 4, we note that $J_3 > (-J_4)$ if and only if $J_3 > \{\sum_{k=1}^{2} v_k x_k^0\}/\{\sum_{k=1}^{2} w_k y_k^0\}$.
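The last equivalence follows from a few lines of algebra. As a sketch, write $V = \sum_k v_k x_k(T)$, $W = \sum_k w_k y_k(T)$, $V_0 = \sum_k v_k x_k^0$ and $W_0 = \sum_k w_k y_k^0$ (shorthand assumed here for illustration), so that $J_3 = V/W$ and $-J_4 = (V_0 - V)/(W_0 - W)$. Assuming the enemy suffers some losses but is not annihilated, i.e. $W_0 > W > 0$, we have

```latex
\begin{align*}
J_3 > -J_4
  &\iff \frac{V}{W} > \frac{V_0 - V}{W_0 - W} \\
  &\iff V\,(W_0 - W) > W\,(V_0 - V) \\
  &\iff V\,W_0 > W\,V_0 \\
  &\iff J_3 = \frac{V}{W} > \frac{V_0}{W_0}.
\end{align*}
```

In words, the survivor worth ratio exceeds the loss worth ratio exactly when the battle has improved the attacker's worth ratio relative to its initial value.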
The most significant thing to be noted in comparing the optimal fire-support policies* for these four problems is that the entire structure of the optimal policy may be changed merely by changing the criterion functional. In particular, singular subarcs (i.e. the splitting of W's fire between $Y_1$ and $Y_2$) do not appear in the solution to Problem 1, even though the necessary conditions for optimality on singular subarcs are exactly the same in all four of these problems. Such singular subarcs are, of course, part of the solution for Problems 2, 3, and 4.

*The referee has insightfully pointed out that it would be interesting to look at the payoff under one criterion corresponding to the optimal control under another criterion. However, this paper emphasizes the structure of the optimal policy rather than the effects on the payoff.

4. DEVELOPMENT OF OPTIMAL POLICY FOR PROBLEM 1

The optimal policy is developed by application of modern optimal control theory. For Problem 1 it is convenient to introduce the force ratio in the $i$th combat zone, $r_i = x_i/y_i$. Then Problem 1 may be written as

(26) $$\text{maximize } \sum_{k=1}^{2} \alpha_k r_k(T) \quad \text{with } T \text{ specified},$$

subject to:

$$\frac{dr_i}{dt} = -a_i + \phi_i c_i r_i \quad \text{for } i = 1, 2,$$

$$\phi_1 + \phi_2 = 1, \quad \phi_i \ge 0, \quad \text{and} \quad r_i \ge 0 \quad \text{for } i = 1, 2,$$

where we recall (2). We also recall that we have assumed that $r_i > 0$.

4.1. Necessary Conditions of Optimality

The Hamiltonian [8] is given by (using (2))

(27) $$H = \lambda_1(-a_1 + \phi c_1 r_1) + \lambda_2(-a_2 + (1 - \phi) c_2 r_2),$$

so that the maximum principle yields the extremal control law

(28) $$\phi^*(t) = \begin{cases} 1 & \text{for } S_\phi(t) > 0, \\ 0 & \text{for } S_\phi(t) < 0, \end{cases}$$

where $S_\phi(t)$ denotes the $\phi$-switching function defined by

(29) $$S_\phi(t) = c_1 \lambda_1 r_1 - c_2 \lambda_2 r_2.$$

The adjoint system of equations (again using (2) for convenience) is given by (assuming that $r_i(T) > 0$)

(30) $$\frac{d\lambda_i}{dt} = -\phi_i c_i \lambda_i \quad \text{with } \lambda_i(T) = \alpha_i \quad \text{for } i = 1, 2.$$
Computing the first two time derivatives of the switching function

(31) $$\dot{S}_\phi(t) = -a_1 c_1 \lambda_1 + a_2 c_2 \lambda_2, \quad \text{and} \quad \ddot{S}_\phi(t) = a_1 c_1 \lambda_1 (c_1 \phi) - a_2 c_2 \lambda_2 (c_2 (1 - \phi)),$$

we see that on a singular subarc (see [32] for a further discussion) we have [8, 21]

(32) $$r_1/a_1 = r_2/a_2, \quad \text{and} \quad a_1 c_1 \lambda_1 = a_2 c_2 \lambda_2,$$

with the singular control given by

(33) $$\phi_S = c_2/(c_1 + c_2).$$

On such a singular subarc the generalized Legendre-Clebsch condition is satisfied, since

$$\frac{\partial}{\partial \phi}\left\{\frac{d^2}{dt^2}\left(\frac{\partial H}{\partial \phi}\right)\right\} = a_1 c_1^2 \lambda_1 + a_2 c_2^2 \lambda_2 > 0.$$

4.2. Synthesis of Extremals

By an extremal we mean a trajectory on which the necessary conditions of optimality are satisfied. In synthesizing extremals by the usual backwards construction procedure (see, for example, [30] or [32]), it is convenient to introduce the "backwards" time defined by $\tau = T - t$. Rather than explicitly constructing extremals and determining domains of controllability [30, 35, 40], it is more convenient to show that the return (i.e. value of the criterion functional) corresponding to certain extremals dominates that from others. For this purpose it suffices to determine all possible types of extremal policies, as we will now do. To this end, we write

(34) $$S_\phi(\tau = 0) = \alpha_2 a_2 c_2 \{R r_1^f/a_1 - r_2^f/a_2\},$$

where

(35) $$R = \alpha_1 a_1 c_1/(\alpha_2 a_2 c_2).$$

Without loss of generality we may assume that $R \ge 1$. Then by (31) we have

(36) $$\overset{\circ}{S}_\phi(\tau = 0) = \alpha_1 a_1 c_1 - \alpha_2 a_2 c_2 \ge 0,$$

where $\overset{\circ}{S}_\phi$ denotes the "backwards" time derivative $\overset{\circ}{S}_\phi = dS_\phi/d\tau$. Considering (31), we may write

(37) $$\overset{\circ}{S}_\phi(\tau) = \alpha_2 a_2 c_2 \{R(\lambda_1/\alpha_1) - (\lambda_2/\alpha_2)\}.$$

It follows that $S_\phi(\tau) > 0$ and $\phi^*(\tau) = 1$, $\forall\, \tau > 0$, when $S_\phi(0) \ge 0$ for $R > 1$ (also when $S_\phi(0) > 0$ for $R = 1$). We also have $S_\phi(\tau) < 0$ and $\phi^*(\tau) = 0$, $\forall\, \tau > 0$, when $S_\phi(0) < 0$ for $R = 1$. There may be a change in the sign of $S_\phi(\tau)$, however, when $S_\phi(0) < 0$ for $R > 1$. In this case $\phi^*(\tau) = 0$ for $0 \le \tau \le \tau_1$ and then

(38) $$S_\phi(\tau) = \alpha_2 a_2 c_2 \{R r_1(\tau)/a_1 - [\exp(c_2 \tau)]\, r_2(\tau)/a_2\},$$

where $\tau_1$ denotes the smallest value of $\tau$ such that $S_\phi(\tau_1) = 0$. It is clear that we must have $\overset{\circ}{S}_\phi(\tau_1) \ge 0$.
If $\overset{\circ}{S}_\phi(\tau_1) > 0$, then we have a transition surface, and from (38) we find that

(39) $$R r_1(t_1)/a_1 - [\exp(c_2 \tau_1)]\, r_2(t_1)/a_2 = 0,$$

where $t_1 = T - \tau_1$. From (37) we find that

(40) $$0 < \tau_1 < (1/c_2) \ln R.$$

If $\overset{\circ}{S}_\phi(\tau_1) = 0$, the singular subarc may be entered, and then we have

(41) $$\tau_1 = (1/c_2) \ln R.$$

In this case we have

(42) $$r_2^f = R r_1^f a_2/a_1 + F(R)\, a_2/c_2,$$

where $r_i^f = r_i(T)$ and $F(R) = 1 + R(\ln R - 1)$. We easily see that $F(R) > 0$ for $R > 1$. When $R = 1$, we see that once the singular subarc is entered (in forwards time), it is never exited by an extremal trajectory.

For the purposes of determining the optimal policy it suffices to consider the following four extremal policies for $R > 1$:

(43) Policy 0: $\phi^*(t) = 0$ for $0 \le t \le T$,

(44) Policy 1: $\phi^*(t) = 1$ for $0 \le t \le T$,

(45) Policy B-B: $$\phi^*(t) = \begin{cases} 1 & \text{for } 0 \le t < T - \tau_1, \\ 0 & \text{for } T - \tau_1 \le t \le T, \end{cases}$$ where $0 < \tau_1 < (1/c_2) \ln R$, and

(46) Policy S: $$\phi^*(t) = \begin{cases} c_2/(c_1 + c_2) & \text{for } 0 \le t < T - \tau_1, \\ 0 & \text{for } T - \tau_1 \le t \le T, \end{cases}$$ where $\tau_1 = (1/c_2) \ln R$ and $r_1^0/a_1 = r_2^0/a_2$.

The only extremal policies that are omitted here are those corresponding to extremals which contain a singular subarc but for which $r_1^0/a_1 \ne r_2^0/a_2$. It is readily seen from (34) that Policy 1 yields $R r_1^f/a_1 \ge r_2^f/a_2$, etc. We also note that corresponding to the bang-bang policy (45) we have

(47) $$r_1(t_1) = \{(c_1 r_1^0 - a_1) \exp(c_1 t_1) + a_1\}/c_1, \qquad r_2(t_1) = r_2^0 - a_2 t_1 > 0.$$

4.3. Determination of the Optimal Fire-Support Policy

It has not been possible to establish the optimality of a policy by citing one of the many sets of sufficient conditions that are available [8, 32, 35]. In particular, although the planning horizon for the problem at hand is of fixed length, one cannot invoke the sufficient conditions based on convexity of Mangasarian [24] or Funk and Gilbert [11] because the right-hand sides of the differential equations (26) are not concave functions of $r_i$ and $\phi_i$.
As we have discussed elsewhere [31-33, 35, 40], however, the optimality of an extremal trajectory may be proven by citing the appropriate existence theorem for an optimal control; for the problem at hand there are two further subcases: (1) if the extremal is unique, then it is optimal, or (2) if the extremal is not unique and only a finite number exist, then the optimal trajectory is determined by considering the finite number of corresponding values of the criterion functional.

The existence of a measurable optimal control follows by Corollary 2 on p. 262 of [23]. In Sections 4.1 and 4.2 above, we have considered necessary conditions of optimality for piecewise continuous controls (see p. 10 and pp. 20-21 of [28]). It remains to show that the measurable optimal control may be taken to be piecewise continuous. This assertion may be proved by observing that if we consider the maximum principle for measurable controls (see p. 81 of [28]) in the backwards synthesis of extremals, then the optimal control may be taken to be piecewise constant (and hence piecewise continuous). This last assertion follows from the control variable appearing linearly in the Hamiltonian (27), the control variable space being compact, and the switching function (29) being continuous for $0 \le t \le T$. The maximum principle (also singular control considerations) then yields that the optimal control must be piecewise constant almost everywhere, since $S_\phi(t)$ can change sign at most once. Hence, it may be considered to be piecewise constant (see p. 130 of [28]). (The authors wish to thank J. Wingate of the Naval Surface Weapons Center, White Oak, for generously pointing out this type of argument.)

We will now show that the optimal control must be constant. By the principle of optimality [8] it suffices, for the purpose of showing that a singular solution is always nonoptimal, to consider a singular extremal which begins with a singular subarc.
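If the optimal control is constant, as this section sets out to show, then selecting it reduces to comparing the returns of the two constant policies. The following Python sketch illustrates that comparison. It assumes the reduced dynamics $dr_i/dt = -a_i + \phi_i c_i r_i$ with $\phi_1 = \phi$ and $\phi_2 = 1 - \phi$ (consistent with (47)) and the Problem 1 return $\alpha_1 r_1(T) + \alpha_2 r_2(T)$; the function names, the dictionary layout, and the numbers are hypothetical, and force-level constraints are ignored.

```python
import math


def ratio_at(T, r0, a, k):
    """Solve dr/dt = -a + k*r with r(0) = r0 and return r(T).
    k = phi_i * c_i is the effective fire-support rate on unit i."""
    if k == 0.0:
        return r0 - a * T
    return a / k + (r0 - a / k) * math.exp(k * T)


def criterion(phi, p, T):
    """Return alpha1*r1(T) + alpha2*r2(T) under a constant control phi
    (ASSUMED form of the Problem 1 criterion in the reduced state space)."""
    r1 = ratio_at(T, p["r1_0"], p["a1"], phi * p["c1"])
    r2 = ratio_at(T, p["r2_0"], p["a2"], (1.0 - phi) * p["c2"])
    return p["alpha1"] * r1 + p["alpha2"] * r2


def best_constant_policy(p, T):
    """Compare Policy 0 (phi = 0) with Policy 1 (phi = 1)."""
    j0 = criterion(0.0, p, T)
    j1 = criterion(1.0, p, T)
    return (1, j1) if j1 >= j0 else (0, j0)


if __name__ == "__main__":
    # Hypothetical data: symmetric coefficients, larger initial ratio r1.
    params = dict(r1_0=3.0, r2_0=2.0, a1=1.0, a2=1.0,
                  c1=1.0, c2=1.0, alpha1=1.0, alpha2=1.0)
    print(best_constant_policy(params, 1.0))
```

With the symmetric coefficients used here, the sketch selects Policy 1, i.e. it concentrates fire where the initial force ratio is larger; this is a property of the chosen numbers, not a general rule.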
We will show that the returns from both Policy B-B and also Policy S for a given point in the initial state space are dominated by the return corresponding to a constant extremal control. We denote the value of the criterion functional corresponding to Policy as Jo, that corresponding to Policy B-B as Jb, etc. Then we have (48) J.=<,=a,o, {(^) f - (1°) i exp (o.r)-[f^+^ (exp feT^-l)]' (49) J,=<.,a,e, {(^)f exp (o,T)+(g) i-[« (exp (cD-D+^j. (50) JB=a2(hC2 (^J - exp ici[T-Ti])+(^j - exp (c^t,) -f2[exp {c,[T-Tr])-l+C,u]-^,[{l + C2[T-rr]) exp (Car,)-!]}' and (51) j,=«,a2cJ(^)^exp(i^r)-[|-2(^-''exp(ii:20-l)+^^i?lni2+^(i?-l)] where a=C2/(ci+C2), a+(8=l, and K=CiC2/{ci-hc2). It is convenient to define At7i_o=«7i— Jo, etc., and then (52) Aj,.,=a2a2C2lR [(g) ( ^""^ ^c.^'^ "^ (®^P (ciT)-l-c,T)^ (53) AJ,.B=a2a2C2 [r [(g) ( exp(c.r)-exp(cjr-r.]) >^_i, ^^^ ^,^T)-exp {cdT-u])-c,u)] -[(g) (^5^^^^)-^ (exp {c2u)+C2[T-u] exp (c^rO-l-c^T)]}, and (54) AJ,.s=(^2(hC2 {(g) [^ {R exp (cir)-l)--^ {R" exp (ii:r)-l)] +|, (i?-^ exp (KT)-l-^)-|(exp (c.T)-l-^£)+J^i? Ini^+i, (i^-l) }. In computing AJi^s we assume that ri'/ai=r2"/a2. We now state and prove Lemma 1. LEMMA 1: Assume that R>1 and T>ti. If AJi_o>0, then AJ,_b>0. 202 J- G. TAYLOR & G. G. BROWN PROOF: (a) We consider for t>T, Then A Ji_o>0 <=) F(t,) >0. (b) We compute that F'(f)=i2{exp (ciO-exp (c,[i-r.])}{(^)-i (1-exp (-c,r,))}+^ (exp (c^t,)-!). (c) If Cir,<'<ai, then drjdt{t)<0 for 0<i<<i so that {ri''/ai)>{ri{ti) /ai)>Ti. It follows that F' (0>0. If Cir,<'>ai, then F' (0>0. Thus, we already have F' {t)>0 for «>r,. (d) By (a) and (c), we have F{t)>0, whence follows the lemma. Q.E.D. LEMMA 2: Assume that ^>1. Then for ti=T—Ti>0, we have AJo-b>0 with AJo-fl>0 for ^i>0. PROOF: (a) We consider for fi>0 +exp fe„) {(^) (HP(5^')-1 (exp fe«,)-l-.,<,) }• We observe that F(0)=0. (b) We compute that F'{h) = -^\y[{c,r,"-a,) exp (c^^O + a.] 
)+ ^^P j^'^'^ {r/exp (c^fj)-^ (exp {c,h)-l)]- di [Ci J 02 [ Ca J Considering (22) and (30), we find that for <,>0 we have F'{ti)=exp idTi) (^J (exp (C2<i)-1)-- (exp (c2<i)-l-C2«i) • (c) Recalling (47) that r2Va2><i, we have for <i>0 F'(<i)>exp (C2T,){<, (exp (C2<i)-1)-- (exp (c2«i)-l-C2«i)} >0, since for t>0 we have ^(0>0, where ^(0 = < (exp (czt) — 1) — (exp ic2t) — l—C2t)/c2. The latter result follows from g{0)=0 and 5f'(0>0,V«>0. (d) Thus, Fiti)>0,yti>0, whence follows the lemma. Q.E.D. As an immediate consequence of Lemmas 1 and 2 and analogous results for R<1, we have Theorem 1. THEOREM 1: For T'>t,>0, we have max (Jo, Ji)'>Jb with strict inequality holding for We next consider Lemma 3. LEMMA 3: Assume that R>1 and T>Ty. Then we have AJi_5>0 with AJi_s>0 fori2>l or T>Tu OPTIMAL FIRE-SUPPORT POLICIES 203 PROOF: (a) We consider for t>0 F{t) = t{ (R exp (ciO-l)/ci-(/2" exp Kt-1)/K} +R{R-p exp {KT)-1-Kt/R)/K^ -R (exp {c^t)-l-CiT/R)/ci'+{R \nR)/{ciCi) + {R-l)/c2'. Then we have FiO)=RiR-»-l)/K'+{R\nR)/ic^C2) + iR-l)/c2'=fiR)>0, with/(i?)>0 for R>1. The latter result follows from/(l)=/'(l) = and /"(i?) = (l-i?-^)/(c,C2i?) >0, V R>1. (b) Computing F'(0=J?''^{i?^ exp (ci<)-exp {Kt)]>R''t {exp (ciQ-exp {Kt)]>0 ior R>\ and ^>0, we see from (a) that F{t; R) >0 with F{t; R)>0 for jR>l or i>0. (c) We now consider G{t) = {R exp (ciO-l}/ci— (i?" exp (Kt)-1}/K. It follows that GiO) = l/c2+R/ci-R''/K=g{R)>0, since £/(l) = and g'{R) = {l-R-»)/c,. Also, G'(t) = R''{RP ex^{ Cit) -exp (KO) >0. Hence, 6^(0 >0. (d) Recalling that r,'>lai>T, we have by (c) that AJi-s>a2a2C2F{T; R)>0 with F{T; R)'>0 ioT R>1 or T>Ti. Q.E.D. From Lemma 3 and the analogous result for R<1, Theorem 2 follows. THEOREM 2: Assume that T>ti. Then max (Jo, Ji) > Jswith inequaUty holding for T>ti. Thus, we see from Theorems 1 and 2 that the optimal control must be constant and equal to either or I for 0<t< T. The results given in Section 3.3 (see, in particular. 
Figures 2 and 3) then follow from consideration of $\Delta J_{1-0}$ (see (52)).

5. DEVELOPMENT OF OPTIMAL POLICY FOR PROBLEM 2

In this case we consider (1) with the criterion functional

$J_2 = \sum_{k=1}^{2} v_k x_k(T) - \sum_{k=1}^{2} w_k y_k(T).$

Thus, for this problem the state space (considering time to be an additional state variable) is five-dimensional.

5.1. Necessary Conditions of Optimality

The Hamiltonian [8] is given by (using (2))

(55) $H = -\sum_{i=1}^{2} p_i a_i y_i - q_1 \phi c_1 y_1 - q_2 (1 - \phi) c_2 y_2$,

so that the maximum principle yields the extremal control law

(56) $\phi^*(t) = 1$ for $S_\phi(t) > 0$, and $\phi^*(t) = 0$ for $S_\phi(t) < 0$,

where $S_\phi(t)$ denotes the $\phi$-switching function defined by

(57) $S_\phi(t) = c_1(-q_1) y_1 - c_2(-q_2) y_2$.

The adjoint system of equations (again using (2) for convenience) is given by (assuming that $x_i(T) > 0$)

(58) $p_i(t) = v_i$ for $0 \le t \le T$ with $i = 1, 2$, and $\dot{q}_i = a_i v_i + \phi_i^* c_i q_i$ with $q_i(T) = -w_i$ for $i = 1, 2$.

Computing the first two time derivatives of the switching function,

(59) $\dot{S}_\phi(t) = -a_1 c_1 v_1 y_1 + a_2 c_2 v_2 y_2$, and $\ddot{S}_\phi(t) = a_1 c_1 v_1 y_1 (c_1 \phi) - a_2 c_2 v_2 y_2 (c_2 (1 - \phi))$,

we see that on a singular subarc we have [8, 21]

(60) $y_1/y_2 = a_2 c_2 v_2/(a_1 c_1 v_1)$, and $(-q_1)/(a_1 v_1) = (-q_2)/(a_2 v_2)$,

with the singular control given by

(61) $\phi_S = c_2/(c_1 + c_2)$.

On such a singular subarc the generalized Legendre-Clebsch condition is satisfied, since by (59)

$\frac{\partial}{\partial \phi}\left\{\frac{d^2}{dt^2}\left(\frac{\partial H}{\partial \phi}\right)\right\} = a_1 c_1^2 v_1 y_1 + a_2 c_2^2 v_2 y_2 > 0.$

For Problem 1 it was convenient to consider a "reduced" state space consisting of $t$, $r_1 = x_1/y_1$, and $r_2$, while for Problem 2 we are considering the "full" state space of $t$, $x_1$, $x_2$, $y_1$, and $y_2$. It seems appropriate to point out the corresponding relation between the adjoint variables in these two state spaces. This relation is easily seen by considering the optimal return function [8], denoted as $W$, and the following transformation of variables:

(62) $t = t$ and $r_i = x_i/y_i$ for $i = 1, 2$.

Then we have, for example,

$p_i(t) = \frac{\partial W}{\partial x_i} = \frac{\partial W}{\partial r_i} \frac{\partial r_i}{\partial x_i},$

so that we obtain

(63) $p_i = \lambda_i/y_i$ and $q_i = -r_i \lambda_i/y_i$ for $i = 1, 2$.

Let us also note that, alternatively,
Problem 1 could have been solved in the "full" state space of $t$, $x_1$, $x_2$, $y_1$, and $y_2$, while Problem 2 cannot be solved in the "reduced" state space. The latter conclusion follows from considering (58) and the requirement (see (63) above) that $p_i/q_i = -1/r_i$ must hold for the transformation (62) to be applicable.

5.2. Synthesis of Extremals

In synthesizing extremals by the usual backwards construction procedure it is convenient to consider

(64) $S_\phi(\tau = 0) = a_2 c_2 v_2 y_2^f \left(\frac{w_1}{a_1 v_1}\right) \left[\frac{a_1 c_1 v_1 y_1^f}{a_2 c_2 v_2 y_2^f} - \left(\frac{w_2}{a_2 v_2}\right) \Big/ \left(\frac{w_1}{a_1 v_1}\right)\right]$,

and

(65) $\overset{\circ}{S}_\phi(\tau) = a_1 c_1 v_1 y_1 - a_2 c_2 v_2 y_2$,

where $\tau$ denotes the "backwards" time defined by $\tau = T - t$, and $\overset{\circ}{S}_\phi$ denotes the "backwards" time derivative $\overset{\circ}{S}_\phi = dS_\phi/d\tau$. We omit most of the tedious details of the synthesis of extremals because of similarity to those in [32].

Without loss of generality we may assume that (5) holds, and then there are two cases to be considered: (I) $w_1/(a_1 v_1) = w_2/(a_2 v_2)$, and (II) $w_1/(a_1 v_1) > w_2/(a_2 v_2)$.

CASE I: $w_1/(a_1 v_1) = w_2/(a_2 v_2)$; i.e. $w_i = k a_i v_i$ for $i = 1, 2$. In this case (64) becomes

$S_\phi(\tau = 0) = a_2 c_2 v_2 y_2^f \left(w_1/(a_1 v_1)\right) \{a_1 c_1 v_1 y_1^f/(a_2 c_2 v_2 y_2^f) - 1\}$,

whence follows the synthesis of extremals shown in Figure 4.

CASE II: $w_1/(a_1 v_1) > w_2/(a_2 v_2)$. In this case it follows from (56), (64), and (65) that for $\rho^f = y_1^f/y_2^f \ge a_2 c_2 v_2/(a_1 c_1 v_1)$, we have $S_\phi(\tau) > 0$ and $\phi^*(\tau) = 1$ for all $\tau > 0$. Since $S_\phi(0) \le 0 \Rightarrow \overset{\circ}{S}_\phi(0) < 0$, it follows that for

$\rho^f \le \left(\frac{a_2 c_2 v_2}{a_1 c_1 v_1}\right) \left(\frac{w_2}{a_2 v_2}\right) \Big/ \left(\frac{w_1}{a_1 v_1}\right)$

we have $S_\phi(\tau) < 0$ and $\phi^*(\tau) = 0$ for all $\tau > 0$. There may be a change in the sign of $S_\phi(\tau)$, however, for $c_2 w_2/(c_1 w_1) < \rho^f < a_2 c_2 v_2/(a_1 c_1 v_1)$. In this case $\phi^*(\tau) = 1$ for $0 \le \tau \le \tau_1$ and then

(66) $S_\phi(\tau) = a_2 c_2 v_2 y_2^f \left\{\frac{1}{c_1}\left[\exp(c_1 \tau) - 1\right] \left(\frac{a_1 c_1 v_1}{a_2 c_2 v_2}\right) \rho^f - \tau + \left(\frac{w_1}{a_1 v_1}\right) \left(\frac{a_1 c_1 v_1}{a_2 c_2 v_2}\right) \rho^f - \left(\frac{w_2}{a_2 v_2}\right)\right\}$.

It is clear that we must have $\overset{\circ}{S}_\phi(\tau_1) \le 0$.
If $\overset{\circ}{S}_\phi(\tau_1) < 0$, then we have a transition surface with $\tau_1$ (denoted as $\tau_\phi$) given by the smaller of the two positive roots of $G(\tau_\phi; \rho^f) = 0$, where $G(\tau; \rho^f)$ is given by (12). If $\overset{\circ}{S}_\phi(\tau_1) = 0$, a singular subarc may be entered, and then we have that $\tau_1$ (denoted as $\tau_S$) is given by the unique nonnegative root of $F(\tau_S) = 0$, where $F(\tau)$ is given by (11). We denote the corresponding value of $\rho^f$ as $\rho_S^f$. Then there is no switch in $\phi^*$ for $\rho^f > \rho_S^f$. We state this result as Theorem 3.

THEOREM 3: $\phi^*(\tau) = 1$ for all $\tau > 0$ when $\rho^f > \rho_S^f$.

PROOF: Immediate by $G(\tau_S; \rho_S^f) = F(\tau_S) = 0$ and $\partial G/\partial \rho^f > 0$, since then there is no solution to $G(\tau; \rho^f) = 0$ for $\rho^f > \rho_S^f$. Q.E.D.

The bounds on $\tau_S$ given by (13) and (14) are developed as follows. First assume that $w_1/(a_1 v_1) < 1/c_1$ and consider

$F(\tau) = \tau + \{1/c_1 - w_1/(a_1 v_1)\} \exp(-c_1 \tau) - \{1/c_1 - w_2/(a_2 v_2)\}$.

Then $c_1 w_1/(a_1 v_1) \le F'(\tau) < 1$ and $F''(\tau) > 0$ for $w_1/(a_1 v_1) < 1/c_1$, whence follow the bounds given by (13). Other developments are similar.

The above information immediately leads to the extremal field shown in Figure 5 (see also Section 3.3).

5.3. Determination of the Optimal Fire-Support Policy

The optimality of the extremal fire-support policy developed above follows according to the reasoning given in Section 4.3 by the uniqueness of extremals.

6. DEVELOPMENT OF OPTIMAL POLICY FOR PROBLEM 3

In this case we consider (1) with the criterion functional

$J_3 = \left\{\sum_{k=1}^{2} v_k x_k(T)\right\} \Big/ \left\{\sum_{k=1}^{2} w_k y_k(T)\right\}.$

6.1. Necessary Conditions of Optimality

The necessary conditions of optimality for Problem 3 are the same as those for Problem 2 except that the boundary conditions for the adjoint variables are different. Thus, (55) through (57) also apply to Problem 3.
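Returning briefly to the singular-subarc entry time $\tau_S$ of Section 5.2: the condition $F(\tau_S) = 0$ is transcendental, so in practice the root would be found numerically. The sketch below uses plain bisection on the $F(\tau)$ quoted in Section 5.2. It assumes $w_2/(a_2 v_2) \le w_1/(a_1 v_1) < 1/c_1$, so that $F(0) \le 0$ and $F$ is increasing; the shorthand `s1`, `s2` for $w_i/(a_i v_i)$, the function names, and the numbers are hypothetical.

```python
import math


def F(tau, c1, s1, s2):
    """F(tau) as quoted in Section 5.2, with s_i = w_i / (a_i * v_i)."""
    return tau + (1.0 / c1 - s1) * math.exp(-c1 * tau) - (1.0 / c1 - s2)


def tau_singular(c1, s1, s2, tol=1e-12):
    """Nonnegative root of F(tau) = 0 by bisection.

    ASSUMES s2 <= s1 < 1/c1, so that F(0) = s2 - s1 <= 0 and, by the
    inequalities on F'(tau) stated in the text, F is increasing."""
    if not (s2 <= s1 < 1.0 / c1):
        raise ValueError("sketch only covers s2 <= s1 < 1/c1")
    lo, hi = 0.0, 1.0
    while F(hi, c1, s1, s2) < 0.0:  # grow the bracket until F changes sign
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F(mid, c1, s1, s2) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Hypothetical values: c1 = 1, w1/(a1 v1) = 0.5, w2/(a2 v2) = 0.25.
    root = tau_singular(1.0, 0.5, 0.25)
    print(root, F(root, 1.0, 0.5, 0.25))
```

Bisection is chosen here only for robustness; since $F''(\tau) > 0$, Newton's method would also be a reasonable choice.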
The adjoint system of equations (again using (2) for convenience) is given by (assuming that $x_i(T) > 0$)

(67) $p_i(t) = v_i/D$ for $0 \le t \le T$ with $i = 1, 2$, and $\dot{q}_i = a_i p_i + \phi_i^* c_i q_i$ with $q_i(T) = -w_i J_3/D$ for $i = 1, 2$,

where $D = \sum_{k=1}^{2} w_k y_k(T)$. Computing the first two time derivatives of the switching function,

(68) $\dot{S}_\phi(t) = -a_1 c_1 p_1 y_1 + a_2 c_2 p_2 y_2$, and $\ddot{S}_\phi(t) = a_1 c_1 p_1 y_1 (c_1 \phi) - a_2 c_2 p_2 y_2 (c_2 (1 - \phi))$,

we find that (60) and (61) again hold on a singular subarc. On such a singular subarc the generalized Legendre-Clebsch condition is satisfied, since by (67) and (68)

$\frac{\partial}{\partial \phi}\left\{\frac{d^2}{dt^2}\left(\frac{\partial H}{\partial \phi}\right)\right\} = (a_1 c_1^2 v_1 y_1 + a_2 c_2^2 v_2 y_2)/D > 0.$

6.2. Synthesis of Extremals

The synthesis of extremals is essentially the same as for Problem 2 (see Section 5.2 above) except that we have

(69) $S_\phi(\tau = 0) = a_2 c_2 v_2 y_2^f \left(\frac{w_1 J_3}{a_1 v_1}\right) \left\{\frac{a_1 c_1 v_1 y_1^f}{a_2 c_2 v_2 y_2^f} - \left(\frac{w_2}{a_2 v_2}\right) \Big/ \left(\frac{w_1}{a_1 v_1}\right)\right\} \Big/ D$,

and

(70) $\overset{\circ}{S}_\phi(\tau) = (a_1 c_1 v_1 y_1 - a_2 c_2 v_2 y_2)/D$.

It follows that

(71) $S_\phi(\tau) = \left\{a_2 c_2 v_2 y_2^f \left(\frac{w_1 J_3}{a_1 v_1}\right) \left[\frac{a_1 c_1 v_1 y_1^f}{a_2 c_2 v_2 y_2^f} - \left(\frac{w_2}{a_2 v_2}\right) \Big/ \left(\frac{w_1}{a_1 v_1}\right)\right] + \int_0^\tau (a_1 c_1 v_1 y_1 - a_2 c_2 v_2 y_2)\, d\sigma\right\} \Big/ D$.

6.3. Determination of the Optimal Fire-Support Policy

As for Problem 2, the optimality of the extremal fire-support policy developed above follows according to the reasoning given in Section 4.3 by the uniqueness of extremals.

7. DEVELOPMENT OF OPTIMAL POLICY FOR PROBLEM 4

In this case we consider (1) with the criterion functional $J_4$, the ratio of the total worths of friendly and enemy losses. The necessary conditions of optimality for Problem 4 are the same as those for Problems 2 and 3, except that the boundary conditions for the adjoint variables are different: at $t = T$ we have

(72) $p_i(T) = v_i/D$, and $q_i(T) = -w_i(-J_4)/D$, for $i = 1, 2$,

where $D$ here denotes the denominator of $J_4$. Consequently, the solution to Problem 4 is exactly the same as that to Problem 3, except that $J_3$ in the solution to Problem 3 is replaced by $(-J_4)$. Because of the dependence of $J_4$ on the initial force levels $x_i^0$, $y_i^0$ for $i = 1, 2$, the two-point boundary-value problem which arises in the determination of switching times when (25) holds is very difficult to solve.

8. DISCUSSION

In this section we discuss what we have learned about the dependence of the structure of optimal time-sequential fire-support policies on the quantification of military objectives.
We studied this dependence by considering four specific problems (each corresponding to a different quantification of objectives, i.e. criterion functional) for which solutions were developed by modern optimal control theory. Our most significant finding is that essentially the entire structure of the optimal fire-support policy may be changed by modifying the quantification of military objectives.

We feel that there are basically two types of military strategies: (1) to obtain a "local" advantage, and (2) to obtain an "overall" advantage. The criterion functional for Problem 1 (i.e. $J_1 = \sum_{k=1}^{2} \alpha_k x_k(T)/y_k(T)$, a weighting of the final force ratios in the two separate combat areas) reflects the striving to attain a "local" advantage (referred to above as a "breakthrough" tactic). The corresponding optimal fire-support policy was to concentrate all supporting fires on one of the enemy units (the quantitative determination of this policy is given in Section 3.3) for the entire period of fire support. However, we have assumed that the X commander has perfect information about the state variables (e.g. enemy force levels) and all Lanchester attrition-rate coefficients (i.e. system parameters). In the real world where this assumption may not hold, this policy need not be optimal. Other factors that would temper the use of such a policy in the real world are (1) the need to "pin down" enemy forces with supporting fires (i.e. suppressive effects), and (2) the giving of information to the enemy as to exactly where his defenses will be attacked by the concentration of preparatory fires only there.

On the other hand, the criterion functionals for Problems 2, 3, and 4 reflect the striving to attain an "overall" advantage (referred to above as an "attrition" tactic which aims to wear down the overall enemy strength).
The corresponding optimal fire-support policies for Problems 2, 3, and 4 were qualitatively the same and could involve a splitting of supporting fires between the two enemy troop concentrations. This property of the optimal fire-distribution policy is not present in the solution to Problem 1 and was anticipated by our earlier work on optimal fire distribution against enemy target types which undergo attrition according to a "linear-law" process (see Section 3.1 above) [31, 32]. The criterion functional for this earlier work was the difference between the overall military worths of friendly and enemy survivors. Thus, we see that nonconcentration of fires on particular target types is characteristic of optimal time-sequential fire distribution over enemy target types which undergo attrition according to a "linear-law" process with the objective of attaining an "overall" advantage.

We saw that the structures of the optimal fire-support policies for Problems 2, 3, and 4 were qualitatively similar. In fact, when one (i.e. the X commander) values enemy (i.e. Y) forces in each of the two combat zones in direct proportion to their rate (per unit of individual weapon system) of destroying the value of opposing friendly forces, the optimal policies were exactly the same for all three problems (see Section 3.3). In this case the optimal fire-support policy took the particularly simple form of Policy A as given by (6). When enemy survivors were not valued in direct proportion to their rate of destruction of friendly value, the optimal policies were different and more complex (see Section 3.3, in particular Figure 5), and the planning horizon may be considered to be divided into two phases, denoted as Phase I and Phase II. The lengths of these two phases depended on different factors in these three problems, and the timing of changes in the allocation of supporting fires could be appreciably different.
When the planning objective was the maximization of the difference in the total military worths of friendly and enemy forces at the end of the "approach to contact," the length of, for example, Phase II (during which all fire is concentrated on $Y_1$) depended only on the attrition-rate coefficients and enemy force levels and was independent of the friendly attacking-force levels. When the ratio of the total worths of surviving friendly and enemy forces was considered (i.e. for Problem 3), the length of Phase II also depended directly on the attacking friendly force levels, while when the ratio of the total worths of friendly and enemy losses was considered, it also depended on the initial total worths of forces.

Thus, we see that (at least for the relatively simple fire-support allocation problem considered here) the structure of the optimal time-sequential allocation policy may be strongly influenced by the quantification of military objectives. Moreover, the most important planning decision apparently is whether a side will seek to attain an "overall" advantage or a "local" advantage. We hope that our investigation has provided a better understanding of the dependence of the structure of optimal fire-support strategies on combatant objectives. In conclusion, it appears to us that more such specific cases warrant investigation for developing a theory of optimal combat strategies.

APPENDIX

Fire-Support Allocation Problem in Which $X_i$'s Fire on $Y_i$ is Not Neglected

When $X_i$'s fire effectiveness against $Y_i$ is not assumed to be negligible, the fire-support allocation problem (1) considered in the main text becomes

(A.1) maximize $J$ over $\phi_i(t)$, with stopping rule: $t_f - T = 0$,
subject to: $dx_i/dt = -a_i y_i$ (battle dynamics) with $dy_i/dt = -b_i x_i - \phi_i c_i y_i$ for $i = 1, 2$,
$x_1, x_2, y_1, y_2 \ge 0$, $\phi_1 + \phi_2 = 1$, and $\phi_i \ge 0$ for $i = 1, 2$.

Unfortunately, the optimal policy to (A.1), for example, for the criterion functional $J_1$ does not take a simple form at all [44].
It appears that without the approximation used in the main text (i.e. $b_i = 0$), it is essentially impossible to analytically develop deep insights into the structure of the optimal policy. See Appendix A in the report by Taylor [44] for further details. Since our goal has been to investigate the dependence of the optimal policy on the quantification of objectives (see also Taylor [37]), we have chosen to study the simpler problems.

REFERENCES

[1] Anderson, L. B., J. Bracken, J. Falk, J. Grotte, and E. Schwartz, "On the Use of Max-Min and Min-Max Strategies in Multistage Games and ATACM," P-1197, Institute for Defense Analyses, Arlington, Virginia (August 1976).
[2] Anderson, L. B., J. Bracken, and E. Schwartz, "Revised OPTSA Model," P-1111, Institute for Defense Analyses, Arlington, Virginia (September 1975).
[3] Antosiewicz, H., "Analytic Study of War Games," Naval Research Logistics Quarterly 2, 181-208 (1955).
[4] Bellman, R., and S. Dreyfus, "On a Tactical Air-Warfare Model of Mengel," Operations Research 6, 65-78 (1958).
[5] Bracken, J., "Two Optimal Sortie Allocation Models, Volume I: Methodology and Sample Results," P-992, Institute for Defense Analyses, Arlington, Virginia (December 1973).
[6] Bracken, J., J. Falk, and A. Karr, "Two Models for Optimal Allocation of Aircraft Sorties," Operations Research 23, 979-995 (1975).
[7] Brackney, H., "The Dynamics of Military Combat," Operations Research 7, 30-44 (1959).
[8] Bryson, A., and Y. C. Ho, Applied Optimal Control (Blaisdell Publishing Company, Waltham, Massachusetts, 1969).
[9] Chattopadhyay, R., "Differential Game Theoretic Analysis of a Problem of Warfare," Naval Research Logistics Quarterly 16, 435-441 (1969).
[10] Fish, J., "ATACM: ACDA Tactical Air Campaign Model," ACDA/PAB-249, Ketron, Inc., Arlington, Virginia (October 1975).
[11] Funk, J., and E. Gilbert, "Some Sufficient Conditions for Optimality in Control Problems with State Space Constraints," SIAM Journal on Control 8, 498-504 (1970).
[12] Galiano, R., and F. Miercort, "Results of a Survey of Tactical Air Campaign Models," Ketron, Inc., Arlington, Virginia (November 1974).
[13] Giamboni, L., A. Mengel, and R. Dishington, "Simplified Model of a Symmetric Tactical Air War," The RAND Corporation, RM-711 (August 1951).
[14] Harris, K., and L. Wegner, "Tactical Airpower in NATO Contingencies: A Joint Air-Battle/Ground-Battle Model (TALLY/TOTEM)," The RAND Corporation, R-1194-PR (May 1974).
[15] Ho, Y. C., "Toward Generalized Control Theory," IEEE Transactions on Automatic Control, Vol. AC-14, 753-754 (1969).
[16] Ho, Y. C., "Differential Games, Dynamic Optimization, and Generalized Control Theory," Journal of Optimization Theory and Applications 6, 179-209 (1970).
[17] Howes, D., and R. Thrall, "A Theory of Ideal Linear Weights for Heterogeneous Combat Forces," Naval Research Logistics Quarterly 20, 645-659 (1973).
[18] Isaacs, R., Differential Games (John Wiley, New York, 1965).
[19] Karr, A., "Stochastic Attrition Models of Lanchester Type," P-1030, Institute for Defense Analyses, Arlington, Virginia (June 1974).
[20] Kawara, Y., "An Allocation Problem of Fire Support in Combat as a Differential Game," Operations Research 21, 942-951 (1973).
[21] Kelley, H., R. Kopp, and H. Moyer, "Singular Extremals," in Topics in Optimization, G. Leitmann (Ed.), pp. 63-101 (Academic Press, New York, 1967).
[22] Lansdowne, Z., G. Dantzig, R. Harvey, and R. McKnight, "Development of an Algorithm to Solve Multi-State Games," Control Analysis Corporation, Palo Alto, California (May 1973).
[23] Lee, E., and M. Markus, Foundations of Optimal Control Theory (John Wiley & Sons, Inc., New York, 1967).
[24] Mangasarian, O., "Sufficient Conditions for the Optimal Control of Nonlinear Systems," SIAM Journal on Control 4, 139-152 (1966).
[25] McNicholas, R., and F. Crane, "Guide to Fire Support Mix Evaluation Techniques, Volume I: The Guide and Appendices A and B," Stanford Research Institute, Menlo Park, California (March 1973).
[26] Moglewer, S., and C. Payne, "A Game Theory Approach to Logistics Allocation," Naval Research Logistics Quarterly 17, 87-97 (1970).
[27] Morse, P., and G. Kimball, Methods of Operations Research (The M.I.T. Press, Cambridge, Massachusetts, 1951).
[28] Pontryagin, L., V. Boltyanskii, R. Gamkrelidze, and E. Mishchenko, The Mathematical Theory of Optimal Processes (Interscience, New York, 1962).
[29] Pugh, G., and J. Mayberry, "Theory of Measures of Effectiveness for General-Purpose Military Forces: Part I. A Zero-Sum Payoff Appropriate for Evaluating Combat Strategies," Operations Research 21, 867-885 (1973).
[30] Taylor, J., "On the Isbell and Marlow Fire Programming Problem," Naval Research Logistics Quarterly 19, 539-556 (1972).
[31] Taylor, J., "Lanchester-Type Models of Warfare and Optimal Control," Naval Research Logistics Quarterly 21, 79-106 (1974).
[32] Taylor, J., "Target Selection in Lanchester Combat: Linear-Law Attrition Process," Naval Research Logistics Quarterly 20, 673-697 (1973).
[33] Taylor, J., "Target Selection in Lanchester Combat: Heterogeneous Forces and Time-Dependent Attrition-Rate Coefficients," Naval Research Logistics Quarterly 21, 683-704 (1974).
[34] Taylor, J., "Solving Lanchester-Type Equations for 'Modern Warfare' with Variable Coefficients," Operations Research 22, 756-770 (1974).
[35] Taylor, J., "On the Treatment of Force-Level Constraints in Time-Sequential Combat Problems," Naval Research Logistics Quarterly 22, 617-650 (1975).
[36] Taylor, J., "On the Relationship Between the Force Ratio and the Instantaneous Casualty-Exchange Ratio for Some Lanchester-Type Models of Warfare," Naval Research Logistics Quarterly 23, 345-352 (1976).
[37] Taylor, J., "Determining the Class of Payoffs that Yield Force-Level-Independent Optimal Fire-Support Strategies," Operations Research 25, 506-516 (1977).
[38] Taylor, J., and G. Brown, "Canonical Methods in the Solution of Variable-Coefficient Lanchester-Type Equations of Modern Warfare," Operations Research 24, 44-69 (1976).
[39] Taylor, J., and S. Parry, "Force-Ratio Considerations for Some Lanchester-Type Models of Warfare," Operations Research 23, 522-533 (1975).
[40] Taylor, J., "Survey on the Optimal Control of Lanchester-Type Attrition Processes," presented at the Symposium on the State-of-the-Art of Mathematics in Combat Models (June 1973) (also Tech. Report NPS55Tw74031, Naval Postgraduate School, Monterey, California, March 1974) (AD 778 630).
[41] Taylor, J., "Application of Differential Games to Problems of Military Conflict: Tactical Allocation Problems, Part II," Naval Postgraduate School Tech. Report No. NPS55Tw72111A, Monterey, California (November 1972) (AD 758 663).
[42] Taylor, J., "Application of Differential Games to Problems of Military Conflict: Tactical Allocation Problems, Part III," Naval Postgraduate School Tech. Report No. NPS55Tw74051, Monterey, California (May 1974) (AD 782 304).
[43] Taylor, J., "Appendices C and D of 'Application of Differential Games to Problems of Military Conflict: Tactical Allocation Problems, Part III'," Naval Postgraduate School Tech. Report No. NPS55Tw74112, Monterey, California (November 1974) (AD A005 872).
[44] Taylor, J., "Optimal Fire-Support Strategies," Naval Postgraduate School Tech. Report No. NPS55Tw76021, Monterey, California (February 1976) (AD A033 761).
[45] Taylor, J., and G. Brown, "An Examination of the Effects of the Criterion Functional on Optimal Fire-Support Policies," Naval Postgraduate School Tech. Report No. NPS55Tw76092, Monterey, California (September 1976) (AD A033 760).
[46] USAF Assistant Chief of Staff, Studies and Analysis, "Methodology for Use in Measuring the Effectiveness of General Purpose Forces, SABER GRAND (ALPHA)" (March 1971).
[47] Weiss, H., "Lanchester-Type Models of Warfare," in Proc. First International Conf. Operational Research, pp. 82-98 (John Wiley & Sons, Inc., New York, 1957).
[48] Weiss, H., "Some Differential Games of Tactical Interest and the Value of a Supporting Weapon System," Operations Research 7, 180-196 (1959).

U.S. GOVERNMENT PRINTING OFFICE: 1978 261-252/2